In this paper, we propose the amphibious influence maximization (AIM) model that combines traditional marketing 
via content providers and viral marketing to consumers in social networks in a single framework. In AIM, a set 
of content providers and consumers form a bipartite network while consumers also form their social network, and 
influence propagates from the content providers to consumers and among consumers in the social network following 
the independent cascade model. An advertiser needs to select a subset of seed content providers and a subset of seed 
consumers, such that the influence from the seed providers passing through the seed consumers could reach a large 
number of consumers in the social network in expectation. 

We prove that the AIM problem is NP-hard to approximate to within any constant factor via a reduction from 
Feige’s k-prover proof system for 3-SAT5. We also give evidence that even when the social network graph is trivial (i.e. 
has no edges), a polynomial-time constant-factor approximation for AIM is unlikely. However, when we assume that 
the weighted bi-adjacency matrix that describes the influence of content providers on consumers is of constant rank, 
a common assumption in recommender systems, we provide a polynomial-time algorithm that achieves an 
approximation ratio of $(1 - 1/e - \epsilon)^3$ for any (polynomially small) $\epsilon > 0$. Our algorithmic results still hold for a more 
general model where cascades in the social network follow a general monotone and submodular function. 

Categories and Subject Descriptors: G.2 [Mathematics of Computing]: Discrete Mathematics; G.3 [Mathematics 
1. INTRODUCTION 

Marketing is traditionally partitioned into several stages: advertisers pay content providers 
(e.g. TV networks, radio stations, online news sites, influential bloggers, etc.); content 
providers recruit an audience; and then the audience members who are exposed to the advertisements 
influence their friends. Today, with the development of the Internet and social networks, there 
is an enormous amount of data that can be used to predict which users will enjoy specific 
content, which users are likely to purchase the advertised product, and which users can 
influence their friends to buy the product as well. More importantly, information is available to 
track the individuals who participate in each one of those interactions. This suggests a new 
marketing approach in which advertisers can contact both content providers and the audience 
at the same time, with the goal of maximizing the overall exposure (through direct exposure 
as well as propagation via social networks) to the advertisement. 

Consider the following example. Suppose a technology company wants to select a subset 
of regular tech bloggers (content providers) and engage them with marketing activities so 
that they would cover the company extensively and favorably. However, this alone does not 
guarantee that these favorable blogs can reach the targeted customers of the company. The 
company may further select a number of non-bloggers and spend its marketing effort on them 
(e.g. buying advertising slots to remind them about the blog entries of their selected bloggers) 
to make them active in subscribing, reading, and propagating the blog entries written by 
the company’s selected bloggers. The objective of the company is to maximize the number of 
targeted customers who get exposed to the favorable blogs, either directly or indirectly through 
links forwarded by friends in the social network. 

The marketing strategy proposed above can be viewed as a combination of traditional marketing 
via content providers and viral marketing in social networks. It can be modeled as 
a controlled diffusion in a joint network consisting of a bipartite graph modeling provider- 
consumer relationship and a social graph modeling social influence relationship among the 
consumers. The bipartite graph and its edge weights indicate the influence from content 
providers to consumers, while the social graph and its edge weights indicate the influence 
among consumers. An advertiser wants to select a subset of content providers (called seed 
providers) and a subset of consumers (called seed consumers) in the social network such that 
the influence from seed providers could activate enough seed consumers, which in turn could 
activate more consumers in the social network. Since the two marketing activities involve 
costs of different types, we enforce separate budgets on provider selection and consumer 
selection. 

In this paper, we model the above combined marketing strategy as the following amphibious 
influence maximization (AIM) problem. We are given (a) a bipartite graph $B = (U, V, M)$ 
where $U$ represents content providers, $V$ represents consumers, and $M$ is the weighted 
bi-adjacency matrix representing the influence probabilities from providers to consumers; and 
(b) a directed social graph $G = (V, P)$ where $V$ is the same set of consumers as in $B$ and $P$ 
is the weighted adjacency matrix representing influence probabilities of each consumer over 
her friends. Given a subset $X \subseteq U$ of seed providers and a subset $Y \subseteq V$ of seed consumers, 
the influence propagates from $X$ to $Y$ and then to other consumer nodes in $V$ following the 
independent cascade model [Kempe et al. 2003]. Given budgets $b_1$ for providers and $b_2$ for 
consumers, the AIM problem is to select at most $b_1$ seed providers and $b_2$ seed consumers such that 
the expected number of activated consumer nodes after the diffusion process is maximized. 

One important feature of the AIM problem formulation is that seed consumer selection is 
non-adaptive. That is, we need to select seed providers and seed consumers together before we 
observe the actual cascades from the seed providers. This is motivated by long-term marketing 
campaigns, during which repeated cascades may be generated from content providers. For 
such campaigns it is impractical for advertisers to adaptively select seed consumers for every 
cascade, and thus non-adaptive seed consumer selection aiming at maximizing the cumulative 
effect over multiple cascades is desirable. 

Our results 

We study both the hardness of the AIM problem and its approximation algorithms in the 
independent cascade (IC) model [Kempe et al. 2003]. In terms of hardness, we warm up (Section 
3) with an easy result that finding any constant-factor approximation for AIM (even when the 
social network graph has no edges at all) is as hard as approximating the densest-k-subgraph 
problem, for which no polynomial-time constant-factor approximation algorithm is known. Our main impossibility result 
(Section 4) is that AIM is also NP-hard to approximate to within any constant factor. The 
result is proven by a reduction from Feige’s k-prover proof system for 3-SAT5 [Feige 1998]. 

In order to overcome the above strong inapproximability results, we introduce additional 
assumptions in our model. Both hardness reductions construct a provider-consumer bipartite 
graph $B$ with a complex and elaborate structure. In practice, even if the true relationship is 
indeed so intricate in nature, most of the learning techniques that are used to estimate this 
relationship assume some simple underlying structure, so we can expect the input to our 
algorithm to be "simple". In particular, for the specific motivation of influence of content providers 
on consumers, a common assumption in the construction of the influence matrix is that it is 
(approximately) low-rank (e.g. the "Netflix Problem"; [Keren et al. 2009]). This assumption is 
typically motivated by modeling the relationship between content and consumers via a small 
number of (hidden) features. In Section 5 we show that when the weighted bi-adjacency 
matrix $M$ has constant rank, we can approximate AIM to within a factor of $(1 - 1/e - \epsilon)^3$ in 
polynomial time for any (polynomially small) $\epsilon > 0$. Our algorithmic result can be generalized 
to accommodate any diffusion model in the social network that has a monotone, submodular, 
and polynomial-time computable influence spread function. 

1.1. Related work 

Influence maximization was first studied as an algorithmic problem with application to 
viral marketing by Domingos and Richardson [2001] and Richardson and Domingos [2002]. 
Kempe et al. [2003] were the first to formulate it as a discrete optimization problem. They summarize 
the independent cascade model and the linear threshold model, and apply submodular function 
maximization to obtain approximation algorithms for influence maximization. Extensive 
research has been done since to improve the scalability of the algorithms, extend the model to 
competitive settings, etc. (cf. [Chen et al. 2013]). 

Conceptually, amphibious influence maximization combines viral marketing with traditional 
marketing via content providers, and thus it enriches viral marketing and its technical 
formulation of influence maximization. Technically, AIM also contains influence 
maximization as a special case: when the provider budget is $b_1 = |U|$ (allowing all providers 
to be seeds) and the bi-adjacency matrix is the all-one matrix (providers deterministically 
activate all seed consumers), AIM reduces to the classical influence maximization problem. 

Recently, Seeman and Singer initiated a line of work [Seeman and Singer 2013; 
Badanidiyuru et al. 2014; Rubinstein et al. 2015] on adaptive seeding in social networks that 
is closely related to ours. In the adaptive seeding problem, a small subset $X$ of the nodes in a 
social network is initially available to an advertiser. In the first stage, the advertiser selects 
(or seeds) a subset $S$ of these nodes, who may influence some of their neighbors. In the second 
stage, a random subset of the neighbors of $S$ becomes available; the advertiser spends the 
rest of her budget on seeding a subset of the newly available nodes, hoping to maximize their 
influence in the social network. The most important difference between Seeman and Singer’s 
model and ours is that in the former, the seeding is adaptive, i.e. the advertiser waits to see 
which of the second layer’s nodes became available before selecting a subset. In our 
model, by contrast, the advertiser must seed consumers in advance; in particular, there is no 
guarantee that after the edge percolation, a seed consumer $y \in Y$ will have live edges with 
seed content providers. As already discussed, non-adaptive seeding is appropriate for marketing 
campaigns during which repeated influence cascades may occur. 

From a technical viewpoint, although we certainly build on ideas from [Badanidiyuru et al. 
2014; Rubinstein et al. 2015], adaptivity completely changes the approximability of the 
problem: all the works above achieve constant-factor approximations in different settings of 
adaptive seeding, while we show that in the non-adaptive case, constant-factor approximation is 
NP-hard.¹ Interestingly, all the above works on adaptive seeding use a non-adaptive 
relaxation of the adaptive problem. It turns out that unlike the non-adaptive AIM problem, the 
non-adaptive relaxation can be approximated efficiently to within a constant factor. (The 
precise factor of approximation depends on other parameters of the problem such as IC model vs. 
a general submodular function.) 

The problem of acceptance probability maximization (APM) for active friending studied by 
Yang et al. [2013] is also related to our work. In APM, a source node needs to select $k$ 
intermediary nodes situated between the source and the target node in a social network, such that if 
influence from the source only propagates through intermediaries, the probability of 
activating the target is maximized. AIM and APM are similar in that both need to select some 
intermediary nodes between the source and the target and both study the non-adaptive version. 
However, their assumptions on influence cascades are different: APM assumes that cascades 
only occur in the sub-network consisting of the source, the selected intermediaries and the 
target, while AIM assumes that cascades occur from the selected sources to the selected 
intermediaries, but from the intermediaries cascades can reach the entire social network. For the APM 
problem, Yang et al. [2013] only provide a heuristic algorithm and do not have hardness of 
approximation results. In Appendix B, we build on the hardness of approximation of AIM to 
prove that it is NP-hard to approximate APM in a general graph to within a near-exponential 
(2"^ ') factor. 

Finally, our algorithm for AIM with constant rank was inspired by recent works that 
(approximately) solve the Densest-k-Bi-Subgraph problem in graphs with (approximately) 
constant rank [Alon et al. 2013; Papailiopoulos et al. 2014]. 


¹Note that this is a comparison of the algorithmic limitations within each model, and not a competitive analysis. 
In particular, whenever seeding adaptively is feasible, it is of course preferable and can perform much better than 
"non-adaptive seeding". As mentioned earlier, our motivation for studying a non-adaptive model is settings where the 
time required to estimate long-term influence of a marketing campaign makes adaptive seeding impractical. 

2. MODEL AND PROBLEM DEFINITION 

We consider a (heterogeneous) network consisting of the following two components. The first 
is a bipartite graph $B = (U, V, M)$, where $U$ represents content providers (e.g. bloggers, TV 
programs, etc.), $V$ represents consumers, and $M$ is the $|U| \times |V|$ weighted bi-adjacency matrix 
with $M_{ij} \in [0, 1]$ denoting the probability that $i \in U$ would successfully activate $j \in V$ (e.g. 
$j$ is influenced by the advertisement associated with $i$). The second is a directed social graph 
$G = (V, P)$, where $V$ is the same as the $V$ in the bipartite graph $B$, and $P$ is the $|V| \times |V|$ 
weighted adjacency matrix with $P_{vw}$ denoting the influence probability from $v \in V$ to $w \in V$. 
We denote the set of directed edges of the social graph as $E = \{(v, w) \mid P_{vw} > 0\}$. 

After fixing a set of seed providers $X \subseteq U$ and a set of seed consumers $Y \subseteq V$, we model 
the influence diffusion from $X$ to the nodes in the social graph $G$ as follows. For each edge 
$(i, j)$ in $B$ we sample it as live with probability $M_{ij}$ and blocked with probability $1 - M_{ij}$; for 
each edge $(v, w) \in E$, we sample it as live with probability $P_{vw}$ and blocked with probability 
$1 - P_{vw}$. We say that a node $v \in V$ is activated (by the influence of $X$ through $Y$) if there is a 
path $(x, y, v_1, \ldots, v_t = v)$ with $x \in X$ and $y \in Y$, and all edges on the path are live. Given $X$ and 
$Y$, we use $\sigma(X, Y)$ to denote the expected number of activated nodes in $V$ (with expectation 
taken over all samples on all edges), and call it the influence spread of $X$ and $Y$. 

Note that the diffusion model can be equivalently described as follows.² First, every seed 
$i \in X$ independently tries to activate every node $j \in Y$ with success probability $M_{ij}$, and $j \in Y$ 
is activated as long as some $i \in X$ activates $j$; nodes outside $Y$ are not activated by seeds 
in $X$. Let $S \subseteq Y$ be the (random) set of nodes activated in $Y$. Then we treat $S$ as the seed set 
and apply the independent cascade model [Kempe et al. 2003] to start the influence diffusion 
from $S$ in the social network $G$ using influence probabilities $P$: namely, at each discrete time 
step, each newly activated node $v \in V$ has one chance to activate each of its outgoing neighbors 
$w \in V$ with probability $P_{vw}$. 
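
The two-stage description above lends itself to a simple Monte Carlo estimate of the influence spread. The following Python sketch is only an illustration under stated assumptions: the function names `simulate_once` and `estimate_spread`, the dense list-of-lists encoding of $M$ and $P$, and the 0-based node indices are all hypothetical choices made here, not part of the paper.

```python
import random

def simulate_once(M, P, X, Y, rng):
    """Run one random cascade and return the set of activated consumers.

    M[i][j]: probability that provider i activates consumer j (assumed dense lists).
    P[v][w]: probability that consumer v activates consumer w.
    """
    # Stage 1: a seed consumer j in Y becomes active if some seed provider
    # i in X succeeds; consumers outside Y are never activated by providers.
    active = {j for j in Y if any(rng.random() < M[i][j] for i in X)}
    # Stage 2: independent cascade in the social graph; each newly activated
    # node gets exactly one chance to activate each outgoing neighbor.
    frontier = list(active)
    while frontier:
        v = frontier.pop()
        for w, p in enumerate(P[v]):
            if w not in active and rng.random() < p:
                active.add(w)
                frontier.append(w)
    return active

def estimate_spread(M, P, X, Y, trials=10000, seed=0):
    """Average number of activated consumers over independent cascades."""
    rng = random.Random(seed)
    total = sum(len(simulate_once(M, P, X, Y, rng)) for _ in range(trials))
    return total / trials
```

The estimate converges to $\sigma(X, Y)$ as the number of trials grows; the sketch makes no attempt at the scalability techniques developed in the influence maximization literature.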

Our goal is to find a set $X$ of seed providers of size $b_1$ and a set $Y$ of seed consumers of size 
$b_2$ such that they work together to generate the largest influence spread, which we formally 
define below. 

Definition 1 (Amphibious Influence Maximization). In the Amphibious Influence 
Maximization (AIM) problem, we are given a bipartite graph $B = (U, V, M)$ and a 
directed social graph $G = (V, P)$, and budgets $b_1$ and $b_2$, and we want to find a subset $X^* \subseteq U$ 
of size $b_1$ and a subset $Y^* \subseteq V$ of size $b_2$ such that the influence spread of $X^*$ and $Y^*$ is 
maximized, that is, finding $X^*$ and $Y^*$ such that 

$$(X^*, Y^*) = \operatorname*{argmax}_{X \subseteq U,\, |X| = b_1,\, Y \subseteq V,\, |Y| = b_2} \sigma(X, Y).$$

Several remarks are now in order. First, when we set $b_1 = |U|$ and $M$ as an all-one 
matrix, the AIM problem is reduced to the classical influence maximization problem defined 
in [Kempe et al. 2003]. Thus, AIM is a generalization of the classical influence maximization 
problem in which provider nodes $U$ and consumer nodes $V$ interact 
and have to work together to spread the influence. Second, it is easy to see that when 


²Equivalence is in the sense of the distribution of the final set of activated nodes in $V$. 

either $X$ or $Y$ is fixed, $\sigma(X, Y)$ as a set function of the other variable is monotone and 
submodular.³ However, the interaction of $X$ and $Y$ makes the AIM problem much harder than 
the classical influence maximization problem: we need nodes in both $X$ and $Y$ to generate 
influence, and missing either set yields no influence at all. Finally, our results can be generalized to 
allow diffusion models in the social network to follow any monotone and submodular function, 
and non-seed consumers to be influenced with background probabilities. To simplify the 
presentation, we focus on the main problem given in Definition 1 and discuss the generalization 
in Section 6. 

3. HIDDEN-CLIQUE HARDNESS 

Before we derive our main hardness result, we briefly describe in this section a much simpler 
reduction which gives a weaker hardness result, "hidden-clique hardness" (sometimes also called 
"planted-clique" hardness). Another feature of this result is that in the hard instance the social network graph $G$ 
has no edges at all! 

Hidden clique. In an Erdős-Rényi random graph $\mathcal{G}(n, 1/2)$ the largest clique size is approximately 
$2 \log_2 n$, with high probability (e.g. [Alon and Spencer 1992]). We can "plant" a clique 
of size $t \gg 2 \log n$ by choosing $t$ nodes at random and adding all the edges between them. 
The hidden clique problem (e.g. [Alon et al. 2011]) is to distinguish between a graph sampled 
from $\mathcal{G}(n, 1/2)$ and a graph from $\mathcal{G}(n, 1/2)$ with a planted clique. Alon et al. [2011] 
reduce this problem to solving the following gap version of DENSEST k-SUBGRAPH. 
Although the planted clique problem has been extensively studied, the best known algorithms 
run in quasi-polynomial time ($n^{O(\log n)}$); in particular, there are no known polynomial-time 
algorithms for the hidden clique problem. 
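
To make the two distributions concrete, the following Python sketch samples a graph from $\mathcal{G}(n, 1/2)$ and plants a clique in it. It is a minimal illustration only; the function names `sample_half_graph` and `plant_clique` and the edge-set representation (pairs `(u, v)` with `u < v`) are assumptions made here, not taken from the paper.

```python
import random
from itertools import combinations

def sample_half_graph(n, rng):
    """Sample G(n, 1/2): each of the n-choose-2 edges appears with probability 1/2."""
    return {(u, v) for u, v in combinations(range(n), 2) if rng.random() < 0.5}

def plant_clique(n, edges, t, rng):
    """Pick t nodes uniformly at random and add every edge between them."""
    clique = rng.sample(range(n), t)
    planted = set(edges)
    planted.update((min(u, v), max(u, v)) for u, v in combinations(clique, 2))
    return planted, set(clique)

rng = random.Random(0)
graph = sample_half_graph(30, rng)
planted_graph, clique_nodes = plant_clique(30, graph, 8, rng)
print(len(graph), len(planted_graph), sorted(clique_nodes))
```

The hidden clique problem asks to tell `graph` and `planted_graph` apart without being given `clique_nodes`.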

Theorem 3.1. (Theorem 1.3 of [Alon et al. 2011]) If there is no polynomial-time algorithm 
for the hidden clique problem with a planted clique of size $t = n^{1/3}$, then for any $\delta > 0$, there is 
no polynomial-time algorithm that, given a graph $G$, distinguishes between: 

Completeness. $G$ has a clique of size $k$; and 
Soundness. Every $k$-subgraph of $G$ has density at most $\delta$. 

Hardness for AIM follows as a corollary: 

Corollary 3.2. If there is no polynomial-time algorithm for the hidden clique problem 
with a planted clique of size $t = n^{1/3}$, then AIM cannot be approximated to within a constant 
factor in polynomial time, even in the special case where the social network graph has no edges. 

Proof. We give a reduction from the DENSEST k-SUBGRAPH problem. 

Reduction. Given an instance $G^{DkS} = (V^{DkS}, E^{DkS})$ of DENSEST k-SUBGRAPH with gap 
parameter $\delta$, we construct an AIM weighted bipartite graph $B = (U, V, M)$ between content 
provider nodes and consumer nodes as follows: We identify both $U$ and $V$ with the original 
set of vertices $V^{DkS}$ (i.e. our AIM instance has twice as many vertices). For any $u \in U$ and 

³A set function $f$ is monotone if for all $S \subseteq T$, $f(S) \le f(T)$, and submodular if for all $S \subseteq T$ and $u \notin T$, 
$f(S \cup \{u\}) - f(S) \ge f(T \cup \{u\}) - f(T)$. 

$v \in V$, we set $M_{u,v} = 1/n^2$ if the corresponding vertices in $G^{DkS}$ are distinct and have an 
edge between them, and $M_{u,v} = 0$ otherwise. We set the budgets to $b_1 = b_2 = k/2$. 
Completeness. If $G^{DkS}$ contains a $k$-clique $C \subseteq V^{DkS}$, then partition $C$ into two subsets of 
size⁴ $k/2$ and label them $X'$ and $Y'$. Consider their respective copies $X \subseteq U$ and $Y \subseteq V$ in 
the bipartite graph: it follows from the construction that $X \cup Y$ is a bi-clique. Thus, every 
consumer in $Y$ is activated with probability $1 - (1 - 1/n^2)^{k/2} = (1 - o(1)) \cdot k/(2n^2)$. Summing over all 
$k/2$ consumers, the expected number of activated nodes is $\mathrm{OPT} = (1 - o(1)) \cdot k^2/(4n^2)$. 

Soundness. Let $X, Y$ be an optimal solution of the AIM instance. Let $S \subseteq V^{DkS}$ be the 
union of the copies of $X$ and $Y$ in $G^{DkS}$. By the premise, $S$ contains at most $\delta \binom{k}{2}$ edges. 
Thus there are at most $2\delta \binom{k}{2}$ edges between $X$ and $Y$, each with weight $1/n^2$ (we may count 
some edges twice in case the copies of their endpoints belong to both $X$ and $Y$). Therefore, 
by a union bound over these edges, $\sigma(X, Y) \le \delta k^2/n^2 \le 5\delta \cdot \mathrm{OPT}$. 


□ 
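
As a sanity check on the completeness calculation, the reduction can be sketched in a few lines of Python. This is a hedged illustration: the names `reduction_matrix` and `spread_no_social_edges`, the 0-based vertex labels, and the toy instance are assumptions made here. It relies only on the fact that, when the social graph has no edges, the influence spread is the sum over seed consumers of their initial activation probabilities.

```python
import math
from itertools import combinations

def reduction_matrix(n, edges):
    """Bi-adjacency matrix of the reduction: weight 1/n^2 on every edge of the
    original n-vertex graph (in both directions), and 0 elsewhere."""
    M = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        if u != v:
            M[u][v] = M[v][u] = 1.0 / n**2
    return M

def spread_no_social_edges(M, X, Y):
    """Exact influence spread when the social graph has no edges: each seed
    consumer j is active with probability 1 - prod_{i in X} (1 - M[i][j])."""
    return sum(1.0 - math.prod(1.0 - M[i][j] for i in X) for j in Y)

# Toy instance (hypothetical): 8 vertices with a 4-clique on {0, 1, 2, 3}.
n, k = 8, 4
M = reduction_matrix(n, list(combinations(range(k), 2)))
# Split the clique into two halves: providers {0, 1} and consumers {2, 3}.
print(spread_no_social_edges(M, X=[0, 1], Y=[2, 3]), k * k / (4 * n * n))
```

On this toy instance the spread is slightly below $k^2/(4n^2)$, matching the $(1 - o(1))$ factor in the completeness bound.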

4. NP-HARDNESS OF APPROXIMATION 

In this section we prove our main hardness result, namely: 

Theorem 4.1. AIM is NP-hard to approximate to within any constant factor. 

Proof outline. We reduce from Feige’s k-prover proof system [Feige 1998]. The provers’ 
answers to questions correspond to the provider nodes in $U$. The provider nodes are connected to 
a subset $V_1 \subset V$ of the consumers, on which the verifier can test the provers’ answers. Since 
the edges from $U$ to $V_1$ appear with low probability, it is significantly more cost-effective to 
select a few nodes from $V_1$ with many neighbors in $U$. Intuitively, this corresponds to a verifier’s 
test which many provers would pass. By Theorem 4.2, if we start from a satisfiable formula, 
all $k$ provers will agree, versus fewer than 2 provers that agree for an unsatisfiable formula. 
The rest of the consumers, $V_2 = V \setminus V_1$, have incoming edges from influential consumers in $V_1$. 
They will guarantee that the provers answer (almost) all the verifier’s questions. 

Notice that our hard instance is a three-layered graph. We henceforth call the provider 
nodes the top layer, the influential consumers $V_1$ constitute the middle layer, whereas the 
bottom layer has the rest of the nodes. 

k-prover proof system. Consider $k$ provers trying to prove the satisfiability of some 3-SAT5 
formula over $n$ variables. A 3-SAT5 formula is a conjunctive-normal-form (CNF) formula 
where each variable appears in exactly 5 clauses, and each clause contains exactly 3 variables; 
notice that there are $5n/3$ clauses. The verifier selects $\ell$ clauses (with replacement) uniformly 
and independently at random. For each clause, the verifier selects one of the participating 
variables uniformly and independently at random; we call those the distinguished variables. 
Each question consists of $\ell/2$ clauses, and $\ell/2$ distinguished variables from the remaining $\ell/2$ 
clauses. An answer $a$ to question $q$ consists of assignments to the $\ell/2 + 3\ell/2 = 2\ell$ variables in 
question. Let $R$ be the set of random strings, and $Q$ be the set of questions. For each random 

⁴We assume without loss of generality that $k$ is even. Given a polynomial-time algorithm for even $k$ it is easy to extend 
it to an algorithm for $k - 1$; e.g. by adding a dummy vertex that is connected to all vertices in the graph. 

string $r \in R$, we associate a question $q \in Q$ for each prover $i \in [k]$; we denote this as $(q, i) \in r$. 
We henceforth abuse notation and also use $R$ and $Q$ to denote the corresponding cardinalities 
$R = n^{\ell} \cdot 5^{\ell}$ and $Q = n^{\ell/2} \cdot (5n/3)^{\ell/2}$. Given the $k$ provers’ answers, Feige’s verifier tests the provers’ 
answers by comparing their answers on the $\ell$ distinguished variables. (We will diverge from 
Feige’s construction at this point and use a stronger test that compares the provers’ answers 
on all $3\ell$ variables.) For constant $\ell$, Feige proves the following theorem. 

Theorem 4.2. (Lemma 2.3.1 in [Feige 1998]) Given a $k$-prover system, it is NP-hard to 
distinguish between the following two cases for a 3-SAT5 formula: 

Completeness. All the provers pass the verifier’s test with probability 1; and 
Soundness. The probability that any pair of provers pass the verifier’s test is at most $2^{-c\ell}$ for 
some constant $c > 0$. 

Construction. We construct a directed graph with three layers: $U, V_1, V_2$. The first layer $U$ 
is precisely the set of "content providers" in our model. The set of "consumers" populates the 
middle and bottom layers, $V = V_1 \cup V_2$. In terms of the weighted adjacency matrices, the 
layered structure means that $M_{u,v} = 0$ for all $u \in U$ and $v \in V_2$, and similarly $P_{v_1,v_2} = 0$ unless 
$v_1 \in V_1$ and $v_2 \in V_2$. 

Going back to the k-prover system, the top layer $U$ corresponds to triplets of provers’ answers 
to questions; the middle layer $V_1$ corresponds to assignments to variables (distinguished 
and non-distinguished) that may appear in the verifier’s question to any of the provers; finally, 
the bottom layer corresponds to the random strings of the verifier. All the edges go from the 
top to the middle layer, or from the middle to the bottom layer. In particular, the graph is 
tripartite. 

More specifically, for each triplet $(q, a, i)$ of (question, answer, prover) we have a corresponding 
node in $U$. For each pair $(r, \vec{a})$ of (verifier’s random string, assignment to all $3\ell$ variables) 
we have a node in $V_1$. Notice that this is different from [Feige 1998], where the elements to be 
covered correspond to $(r, a_r, i)$ with $a_r$ being the assignment only for the distinguished variables. 
The $(q, a, i)$ node is connected to all the nodes $(r, \vec{a})$ such that $(q, i) \in r$ and, when 
restricting $\vec{a}$ to the variables specified by $(q, i)$, it is equal to $a$. In particular, for each $i$, each 
$(r, \vec{a})$ corresponds to only one $(q, a, i)$. We set the top-layer budget to be the number of nodes 
in $U$ that correspond to a single assignment, $b_1 = kQ$; similarly we let $b_2 = R$ represent the 
number of nodes in $V_1$ that match the same assignment. Finally, all the edges from $U$ to $V_1$ 
have probability $1/k$. 

For each random string $r$, we have $\eta$ nodes in the bottom layer, $V_2$. We choose a sufficiently 
large $\eta$ to ensure that most of the utility comes from the bottom layer. The nodes corresponding 
to each $r$ are connected to all the nodes $(r, \vec{a})$ in $V_1$ with probability 1. The role of this layer 
is to force any good assignment to spread its budget across the different random strings (i.e. 
make sure that the provers answer all the questions). 

See Table I for a summary of notation. 

Completeness. Given a satisfying assignment to the 3-SAT5 formula, we select in the top 
layer a subset $S \subset U$ of $kQ$ nodes that correspond to the same assignment. Because they all 
correspond to the same assignment, for each random string $r$, all $k$ corresponding nodes in $S$ 
Table I: Summary of notation in main reduction 

| Notation | Interpretation in k-prover system | Vertices in AIM |
|---|---|---|
| $(q, a, i)$ | question, answer, prover | 1 vertex in $U$ |
| $(q, i)$ | question, prover | $2^{3\ell/2}$ vertices in $U$ |
| $r$ | random string | $k \cdot 2^{3\ell/2}$ vertices in $U$ ($\forall (q, i) \in r$ and $a \in \{0, 1\}^{3\ell/2}$) |
| $(r, \vec{a})$ | random string, assignment to all $3\ell$ variables | 1 vertex in $V_1$ |
| $r$ | random string | $\eta$ vertices in $V_2$ |
| | random string, copy | 1 vertex in $V_2$ |


are connected to the common node $(r, \vec{a}^*)$. In the middle layer, we let $T$ be the set of these 
$R$ nodes (i.e. $(r, \vec{a}^*)$ for $r \in R$). Before sampling the edges, each $(r, \vec{a}^*)$ has $k$ neighbors in $S$. 
After sampling, the probability that there is a path from $S$ to $(r, \vec{a}^*)$ is $1 - (1 - 1/k)^k \approx 1 - 1/e$. 
Since each node in $T$ has $\eta$ neighbors in $V_2$ (with probability 1), the value of this solution is 
approximately $\mathrm{OPT} \approx (1 - 1/e) R\eta$. 

Soundness. In an unsatisfiable instance, any two provers agree for at most a $2^{-c\ell}$-fraction 
of the random strings. We will show in Lemma 4.3 that there are at most $2 \cdot 2^{-(c/3)\ell} \cdot R$ 
good random strings $r$, which are strings $r$ such that there is a node $(r, \vec{a})$ with more than 
one neighbor in $S$. Since for each random string $r$ there are only $\eta$ nodes in $V_2$, each of the 
good random strings contributes at most $\eta$ to the value of the solution. Before sampling the 
edges, any node that does not correspond to a good random string has at most one neighbor 
in $S$. After sampling, the probability that any such node has a live edge to a neighbor in $S$ is at most $1/k$. 
Since each node in $V_1$ has $\eta$ neighbors in $V_2$, the total contribution from the $R$ nodes that do not 
correspond to good random strings is bounded by $R\eta/k$. Therefore, the expected number of 
covered nodes is bounded by the contribution of the middle layer, plus the contributions from 
the good and bad random strings: 

$$R + (2 \cdot 2^{-(c/3)\ell} \cdot R)\,\eta + R\eta/k = (1/k + o(1))\, R\eta \ll \mathrm{OPT}.$$
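
The gap between the two cases can be checked numerically. The short Python sketch below compares the per-node activation probability in the completeness case, $1 - (1 - 1/k)^k$, with the $1/k$ bound of the soundness case. It deliberately ignores the $o(1)$ terms and the common factor $R\eta$, and the function names are hypothetical.

```python
import math

def completeness_prob(k):
    """Probability that a middle-layer node with k seed neighbors, each
    connected by an edge of probability 1/k, is reached by some live edge."""
    return 1.0 - (1.0 - 1.0 / k) ** k

def soundness_prob(k):
    """Upper bound on the same probability when the node has one seed neighbor."""
    return 1.0 / k

def gap_ratio(k):
    """Ratio between the two cases, ignoring lower-order terms."""
    return completeness_prob(k) / soundness_prob(k)

for k in (2, 10, 100, 1000):
    print(k, completeness_prob(k), soundness_prob(k), gap_ratio(k))
print("limit of completeness probability:", 1.0 - 1.0 / math.e)
```

The ratio grows roughly like $(1 - 1/e) \cdot k$, which is why taking the number of provers $k$ to be a large constant rules out any fixed constant-factor approximation.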

Lemma 4.3 below completes the proof of Theorem 4.1. 

Lemma 4.3. There are at most $2 \cdot 2^{-(c/3)\ell} \cdot R$ good random strings. 

Proof. Intuitively, any $(r, \vec{a})$ which has more than one neighbor in $S$ corresponds to an 
agreement of at least two provers - and therefore should be a rare event. In order to turn 
this intuition into a proof, we must rule out solutions that distribute the budget in an uneven 
manner that does not correspond to answers of honest provers to verifier’s questions. 

In expectation, for each $(q, i)$ there is only one $(q, a, i) \in S$. Therefore by Markov’s inequality, 
for at most a $2^{-(c/3)\ell}$-fraction of $(q, i)$’s, more than $2^{(c/3)\ell}$ corresponding nodes belong to $S$; we 
call those $(q, i)$’s heavy, and light otherwise, i.e., 

$$\Pr_r[\exists i : (q, i) \text{ is heavy}] \le 2^{-(c/3)\ell}. \qquad (1)$$

We henceforth focus on bounding the number of good random strings that correspond only to 
light $(q, i)$’s. 

Consider only $r$’s whose $(q, i)$’s are light. For each light $(q, i)$, there are at most $2^{(c/3)\ell}$ nodes 
$(q, a, i)$ in $S$. In other words, each prover submits at most $2^{(c/3)\ell}$ answers to each question. 
By Theorem 4.2, if each prover submits only one answer to each question, the fraction of 
random strings for which at least one pair agrees is at most $2^{-c\ell}$; having $2^{(c/3)\ell}$ answers, the 
probability that any pair agrees increases by a factor of at most $2^{(2c/3)\ell}$. Therefore at most a $2^{-(c/3)\ell}$ 
fraction of random strings have at least one pair of agreeing answers. Recall that a random 
string $r$ is good if for some $\vec{a}$ the node $(r, \vec{a})$ has more than one neighbor $(q, a, i)$ in $S$. 

$$\Pr_r[r \text{ is good and } \forall i : (q, i) \text{ is light}] \le 2^{-(c/3)\ell} \qquad (2)$$

Summing with (1), we have that: 

$$\Pr_r[r \text{ is good}] \le 2 \cdot 2^{-(c/3)\ell} \qquad (3)$$

□ 

5. ALGORITHM FOR CONSTANT RANK WEIGHTED BI-ADJACENCY MATRIX M 

The previous sections show that the AIM problem for general bipartite graph B and social 
graph G is hard to approximate to within any constant factor. In this section, we restrict the 
(weighted) bi-adjacency matrix M between content provider nodes and consumer nodes to be 
of constant rank r, and show that for this case we can obtain a constant factor approximation 
in polynomial time. We denote this restricted problem AIM-r. Our main algorithmic result is: 

Theorem 5.1. For any constant $r > 0$ and any $\epsilon, \delta > 0$, AIM-r can be approximated to within 
$(1 - 1/e - \epsilon)^3$ with probability $1 - \delta$ and in time polynomial in $n, m, A, 1/\epsilon, \log(1/\delta)$, where $n = |U|$, 
$m = |V|$ and $A$ is the maximum number of bits in any entry of matrix $M$.⁵ 

For any fixed $X$, $\sigma(X, Y)$ is a monotone submodular function of $Y$. Similarly, for any fixed 
$Y$, $\sigma(X, Y)$ is a monotone submodular function of $X$. Each of those can be (approximately) 
optimized independently, thus the main algorithmic challenge is due to the interaction between 
the choice of $X$ and the choice of $Y$. Intuitively, a constant rank bi-adjacency matrix creates 
an "information bottleneck" which restricts the complexity of this interaction. 

How can we use the restriction on the matrix rank to optimize a non-linear objective? To 
this end, we introduce in Subsection 5.2 a relaxation of our objective function which conveniently 
views $M$ as a linear operator acting on $X$. Because $M$ has constant rank, the resulting 
subspace has a constant dimension; in Subsection 5.3, we show that we can efficiently 
(approximately) enumerate over all the points in this subspace. Finally, given the 
(approximately) optimal choice of $X$, we can use standard submodular maximization techniques to 
(approximately) optimize over $Y$ (Subsection 5.4). 


®The running time is exponential in r. See Section 5.5 for more details.

5.1. Notation

Henceforth, we use the following notational conventions. For a vector $\mathbf{x}$, $x_i$ is the $i$-th element
of $\mathbf{x}$. All vectors are column vectors (unless otherwise stated). Let $|U| = n$ and $|V| = m$. When
the context is clear, we also use the index set $[n] = \{1, 2, \ldots, n\}$ to represent $U$ and the index set
$[m] = \{1, 2, \ldots, m\}$ to represent $V$.

Given the provider seed set $X \subseteq U$ and the consumer seed set $Y \subseteq V$, for convenience we
denote by $\mathbf{x}$, $\mathbf{y}$ the indicator vectors of $X$ and $Y$, respectively. An indicator vector for a subset $X$
of $U$ is a vector in $\{0,1\}^n$ such that the entries corresponding to nodes in $X$ are 1's and those
corresponding to nodes in $U \setminus X$ are 0's; the indicator vector for a subset $Y$ of $V$ is defined similarly.

Given $\mathbf{x}$ and $\mathbf{y}$, we use $f_j(\mathbf{x}, \mathbf{y})$ to denote the initial activation probability of each node
$j \in V$, which is the probability that some node $i \in X$ activates $j$ based on matrix $M$, i.e.,
$f_j(\mathbf{x}, \mathbf{y}) = y_j \left(1 - \prod_{i \in [n]} (1 - x_i M_{ij})\right)$. We denote by $\mathbf{f}(\mathbf{x}, \mathbf{y})$ the vector
$(f_1(\mathbf{x}, \mathbf{y}), \ldots, f_m(\mathbf{x}, \mathbf{y}))^\top$.

Note that the initial activations of the nodes in $Y$ by nodes in $X$ are mutually independent
across the nodes in $Y$. Moreover, the initial activation probability of node $j \in V$ is not its final
activation probability, which is the probability that node $j$ is activated by the end of the diffusion
process, since $j$ may be later activated by other nodes in $V$ through the diffusion process in
the social graph $G$. In particular, a node $j \in V \setminus Y$ has zero initial activation probability by our
model definition, but its final activation probability may be greater than zero.
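
As a concrete illustration of this notation, the following minimal sketch evaluates the vector of initial activation probabilities directly from the formula for $f_j$. The function name `initial_activation` and the toy matrix are illustrative assumptions and do not come from the paper.

```python
from math import prod

def initial_activation(M, x, y):
    """Return [f_1, ..., f_m] with f_j = y_j * (1 - prod_i (1 - x_i * M[i][j]))."""
    n, m = len(M), len(M[0])
    return [
        y[j] * (1.0 - prod(1.0 - x[i] * M[i][j] for i in range(n)))
        for j in range(m)
    ]

# Toy instance (illustrative values only): 2 providers, 3 consumers.
M = [[0.5, 0.2, 0.0],
     [0.5, 0.0, 0.9]]
x = [1, 1]       # both providers are seeds
y = [1, 0, 1]    # consumers 0 and 2 are seeds; consumer 1 is not
f = initial_activation(M, x, y)
```

Consumer 1 is not a seed, so its initial activation probability is zero regardless of $M$, which matches the remark above about nodes in $V \setminus Y$.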


5.2. A concave relaxation 

A key step in our algorithm is to approximate every coordinate $f_j(\mathbf{x}, \mathbf{y})$ via the following
concave relaxation®

$$F_j(\mathbf{x}, \mathbf{y}) = y_j \left(1 - e^{-\sum_{i \in [n]} x_i M_{ij}}\right) = y_j \left(1 - e^{-(\mathbf{x}^\top M)_j}\right).$$

Notice that this relaxation has two important features: (a) it is a function of the linear form
$(\mathbf{x}^\top M)_j$, which allows us to use the constant rank condition; and (b) it is both concave in $\mathbf{x}$ for
a fixed $\mathbf{y}$, and concave in $\mathbf{y}$ for a fixed $\mathbf{x}$; this will make it much easier to maximize efficiently.
We now show in the following lemma that it is a $(1 - 1/e)$-approximation of $f_j(\mathbf{x}, \mathbf{y})$.


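Lemma 5.2. For any $\mathbf{x} \in \{0,1\}^n$ and $\mathbf{y} \in \{0,1\}^m$,

$$(1 - 1/e)\, f_j(\mathbf{x}, \mathbf{y}) \le F_j(\mathbf{x}, \mathbf{y}) \le f_j(\mathbf{x}, \mathbf{y}), \quad \forall j = 1, \ldots, m.$$

Proof. For the right inequality, since $e^{-a} \ge 1 - a$ for any real $a$, we get

$$F_j(\mathbf{x}, \mathbf{y}) = y_j \left(1 - e^{-\sum_{i=1}^{n} x_i M_{ij}}\right) \le y_j \left(1 - \prod_{i=1}^{n} (1 - x_i M_{ij})\right) = f_j(\mathbf{x}, \mathbf{y}).$$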

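® The relaxation is inspired by [Badanidiyuru et al. 2014]. Essentially the same relaxation was also used before by
[Dughmi et al. 2011] in the context of Poisson rounding.

For the left inequality, because $(1 - e^{-1})\, a \le 1 - e^{-a}$ holds for any $a \in [0,1]$, we have

$$
\begin{aligned}
(1 - e^{-1}) \left(1 - \prod_{i=1}^{n} (1 - x_i M_{ij})\right)
  &= (1 - e^{-1}) \sum_{i=1}^{n} x_i M_{ij} \prod_{k=1}^{i-1} (1 - x_k M_{kj}) \\
  &\le \sum_{i=1}^{n} \left(1 - e^{-x_i M_{ij}}\right) \prod_{k=1}^{i-1} (1 - x_k M_{kj}) \\
  &\le \sum_{i=1}^{n} \left(1 - e^{-x_i M_{ij}}\right) \prod_{k=1}^{i-1} e^{-x_k M_{kj}} \\
  &= \sum_{i=1}^{n} \left( \exp\Big(-\sum_{k=1}^{i-1} x_k M_{kj}\Big) - \exp\Big(-\sum_{k=1}^{i} x_k M_{kj}\Big) \right) \\
  &= 1 - e^{-\sum_{i=1}^{n} x_i M_{ij}} \\
  &= 1 - e^{-(\mathbf{x}^\top M)_j}.
\end{aligned}
$$

Multiplying by $y_j$ on both sides of the above inequality, we get $(1 - 1/e)\, f_j(\mathbf{x}, \mathbf{y}) \le F_j(\mathbf{x}, \mathbf{y})$. □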
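
The sandwich bound of Lemma 5.2 can also be checked numerically on small inputs. The sketch below enumerates every binary $\mathbf{x}$ for a single column of $M$; the helper names (`f_j`, `F_j`, `lemma_5_2_holds`) and the tolerance are assumptions made for illustration, not part of the paper.

```python
from itertools import product
from math import exp, prod

def f_j(col, x, yj):
    """Exact initial activation probability for one consumer (column `col` of M)."""
    return yj * (1.0 - prod(1.0 - xi * p for xi, p in zip(x, col)))

def F_j(col, x, yj):
    """Concave relaxation y_j * (1 - exp(-(x^T M)_j)) for the same column."""
    return yj * (1.0 - exp(-sum(xi * p for xi, p in zip(x, col))))

def lemma_5_2_holds(col, tol=1e-12):
    """Check (1 - 1/e) f_j <= F_j <= f_j for every binary x and y_j in {0, 1}."""
    c = 1.0 - exp(-1.0)
    for x in product([0, 1], repeat=len(col)):
        for yj in (0, 1):
            f, F = f_j(col, x, yj), F_j(col, x, yj)
            if not (c * f - tol <= F <= f + tol):
                return False
    return True

ok = lemma_5_2_holds([0.5, 0.2, 1.0, 0.0])
```

The left inequality is tight when a single provider activates the consumer with probability 1, which is why the check allows a small floating-point tolerance.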


5.3. Approximating initial activation probability $\mathbf{f}(\mathbf{x}, \mathbf{y})$ via $(1 + \varepsilon)$-net construction

Notice that the value of $F_j(\mathbf{x}, \mathbf{y})$ is uniquely determined by $\mathbf{x}^\top M$ and $\mathbf{y}$. We use $\mathrm{Im}\, M$ to denote
the image of $M$ when $M$ is treated as a linear operator from $\{0,1\}^n$ to $\mathbb{R}^m$, i.e. $\mathrm{Im}\, M$ is the
subspace spanned by the vectors $\mathbf{x}^\top M \in \mathbb{R}^m$ for $\mathbf{x} \in \{0,1\}^n$. Recall that since $M$ has constant rank $r$, the
dimension of $\mathrm{Im}\, M$ is also $r$.

Our next goal is to enumerate over (approximately) all feasible $\mathbf{s} \in \mathrm{Im}\, M$. For any $\varepsilon > 0$, we
say that a set $S_\varepsilon \subseteq \mathbb{R}^m$ is a multiplicative $(1 + \varepsilon)$-net for $M$ if for every $\mathbf{x} \in \{0,1\}^n$ there exists
a corresponding point $\mathbf{s} = (s_1, s_2, \ldots, s_m) \in S_\varepsilon$ such that for each coordinate $j \in [m]$,

$$s_j \le (\mathbf{x}^\top M)_j \le s_j (1 + \varepsilon). \tag{4}$$

Henceforth we drop the multiplicative qualification and simply call such a set a $(1 + \varepsilon)$-net.

Lemma 5.3. Let $M \in [0,1]^{n \times m}$ be a matrix with constant rank $r$, and whose entries can be
represented with $\lambda$ bits. Then for any error parameter $\varepsilon > 0$, we can output a polynomial-size
$(1 + \varepsilon)$-net for $M$ in time $\mathrm{poly}(n, m, 1/\varepsilon, \lambda)$.

Proof. Below, we show how to construct a weak $(1 + \varepsilon)$-net (in Algorithm 1), which instead
of Equation (4) gives the following weaker, two-sided error guarantee:

$$s_j / (1 + \varepsilon) \le (\mathbf{x}^\top M)_j \le s_j (1 + \varepsilon). \tag{5}$$

Given such an algorithm, one can construct a $(1 + \varepsilon)$-net (with one-sided error) by constructing
a weak $\sqrt{1 + \varepsilon}$-net, and dividing every entry in the obtained weak $\sqrt{1 + \varepsilon}$-net by $\sqrt{1 + \varepsilon}$.

Since each entry of $M \in [0,1]^{n \times m}$ has at most $\lambda$ bits, for any $\mathbf{x} \in \{0,1\}^n$, every nonzero
entry of $\mathbf{x}^\top M$ is bounded in $[2^{-\lambda}, n]$.^ Thus in each dimension, we lose no more than a $(1 + \varepsilon)$-factor
by only considering each position in $S_\lambda = \{0,\ 2^{-\lambda},\ 2^{-\lambda}(1 + \varepsilon),\ 2^{-\lambda}(1 + \varepsilon)^2,\ \ldots,\ n\}$.
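
To make the one-dimensional grid concrete, here is a minimal sketch that builds $S_\lambda$ as a sorted list. The function name `grid` and the parameter names `lam`, `n`, `eps` are assumptions chosen for this illustration.

```python
def grid(lam, n, eps):
    """Return S_lambda = [0, 2^-lam, 2^-lam (1+eps), 2^-lam (1+eps)^2, ..., n], sorted."""
    points = [0.0]
    value = 2.0 ** (-lam)
    while value < n:
        points.append(value)
        value *= 1.0 + eps
    points.append(float(n))   # the top of the range is always included
    return points

S = grid(lam=4, n=3, eps=0.5)
```

Consecutive positive grid points differ by a factor of at most $1 + \varepsilon$, so the grid has roughly $(\lambda + \log_2 n) / \log_2(1 + \varepsilon)$ points, which is where the $(\lambda + \log n)/\varepsilon$ dependence per coordinate comes from.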


^This step assumes of course that all the entries in M are positive. While this is natural for an adjacency matrix of
a social network, when taking the low-rank approximation of one, this may no longer be true. Nevertheless, a similar
analysis continues to hold even when M has negative entries.

We consider the partitioning of $[0, n]^m$ into hyper-rectangles that is induced by
$\underbrace{S_\lambda \times \cdots \times S_\lambda}_{m \text{ times}}$.
More precisely, the set of hyper-rectangles $\mathcal{H}$ is every possible direct product of intervals, i.e.,

$$\mathcal{H} = \Big\{ [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_m, b_m] \;:\; \forall i \in [m],\ a_i \in S_\lambda \setminus \{n\},\ b_i = \min\{\max\{(1 + \varepsilon) a_i,\ 2^{-\lambda}\},\ n\} \Big\}.$$

Observe that the disjoint union of those hyper-rectangles covers $[0, n]^m$; in particular, for any
$\mathbf{x} \in \{0,1\}^n$, $\mathbf{x}^\top M$ must belong to a hyper-rectangle $H \in \mathcal{H}$.

Notice that if $\mathbf{x}^\top M$ and $\mathbf{s}$ lie in the same hyper-rectangle $H$, then they satisfy the two-sided
error approximation guarantee of Equation (5). Our weak $(1 + \varepsilon)$-net has at least one such $\mathbf{s}$
for every $\mathbf{x}^\top M$.

Consider any hyper-rectangle $H \in \mathcal{H}$ with a non-empty intersection with $\mathrm{Im}\, M$, and let
$I_H = H \cap \mathrm{Im}\, M$ denote their intersection. Since $I_H$ is an intersection of convex polytopes, it
is also a convex polytope, defined by $m - r$ linearly independent equations that define $\mathrm{Im}\, M$
and $2m$ inequalities that define $H$. Every vertex $\mathbf{v}$ of this polytope must lie on the intersection
of $m$ linearly independent constraints: $m - r$ equations and $r$ inequalities. In other words, $\mathbf{v}$
lies in the intersection of at least $r$ facets of $H$; i.e. there exist $i_1, \ldots, i_r \in [m]$ such that
$v_{i_k} \in \{a_{i_k}, b_{i_k}\}$ for all $k \in [r]$. Furthermore, these $r$ coordinates must correspond to linearly independent
columns of $M$.

Therefore, the following polynomial-time procedure (summarized in Algorithm 1) correctly
constructs a weak $(1 + \varepsilon)$-net: First we take a submatrix $M'$ of $M$ with $r$ linearly independent
rows, so $\mathrm{Im}\, M' = \mathrm{Im}\, M$ (Line 3). Next we enumerate over all $\binom{m}{r}$ $r$-tuples of linearly independent
columns of $M'$ (Line 4). For each $r$-tuple, we enumerate over all vectors in
$\underbrace{S_\lambda \times \cdots \times S_\lambda}_{r \text{ times}}$ (Line 5).
Each such vector uniquely defines a point $\mathbf{s} \in H \cap \mathrm{Im}\, M$ (Line 7). Finally, adding it to $S_\varepsilon$
guarantees that we can approximate any other $\mathbf{x}^\top M \in H$ (Line 8). □
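
For intuition, the construction is easiest to follow in the rank-one case, where the linear system in Algorithm 1 has a single unknown. The following sketch is only an illustration under that assumption ($r = 1$, $M$ nonzero, entries in $[0,1]$ with at most `lam` bits); the names `rank_one_net` and `covered` are hypothetical, and the brute-force checker is feasible only for very small $n$.

```python
from itertools import product

def grid(lam, n, eps):
    """S_lambda = [0, 2^-lam, 2^-lam (1+eps), ..., n]."""
    points, value = [0.0], 2.0 ** (-lam)
    while value < n:
        points.append(value)
        value *= 1.0 + eps
    points.append(float(n))
    return points

def rank_one_net(M, lam, eps):
    """Weak (1+eps)-net for a nonzero rank-one M, following Algorithm 1 with r = 1."""
    n = len(M)
    v = next(row for row in M if any(row))   # Line 3: one independent row spans Im M
    net = []
    for i, vi in enumerate(v):               # Line 4: columns with a nonzero entry
        if vi == 0:
            continue
        for k in grid(lam, n, eps):          # Line 5: grid value for coordinate i
            t = k / vi                       # Lines 6-7: solve t * v_i = k
            net.append([t * vj for vj in v]) # Line 8: s = t * v
    return net

def covered(M, net, eps, tol=1e-9):
    """Brute-force check of guarantee (5) for every x in {0,1}^n."""
    n, m = len(M), len(M[0])
    for x in product([0, 1], repeat=n):
        target = [sum(x[i] * M[i][j] for i in range(n)) for j in range(m)]
        if not any(all(s[j] / (1 + eps) - tol <= target[j] <= s[j] * (1 + eps) + tol
                       for j in range(m)) for s in net):
            return False
    return True

# Rank-one example: M[i][j] = u[i] * v[j]; all entries are multiples of 2^-4.
u, v = [0.5, 1.0, 0.25], [0.5, 0.25, 1.0]
M = [[ui * vj for vj in v] for ui in u]
net = rank_one_net(M, lam=4, eps=0.5)
```

With $r = 1$ the net has at most $m \cdot |S_\lambda|$ points; for general $r$ the same enumeration runs over $r$-tuples of columns and grid values, which gives the $m^r |S_\lambda|^r$ size discussed in Section 5.5.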

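We define

$$F_j(\mathbf{s}, \mathbf{y}) = y_j \left(1 - e^{-s_j}\right), \tag{7}$$

thus $F_j(\mathbf{s}, \mathbf{y}) = F_j(\mathbf{x}, \mathbf{y})$ for all $\mathbf{s} = \mathbf{x}^\top M$. Notice that the definition of $F_j(\mathbf{s}, \mathbf{y})$ naturally extends
also to $\mathbf{s} \notin \mathrm{Im}\, M$. In the following lemma we relate $F_j$ to the original $f_j$.

Lemma 5.4. Let $\varepsilon > 0$, and let $S_\varepsilon$ be a $(1 + \varepsilon)$-net for $M$. Then for every $\mathbf{x} \in \{0,1\}^n$, there
exists an $\mathbf{s} \in S_\varepsilon$ such that for every $\mathbf{y} \in \{0,1\}^m$ and $j \in [m]$,

$$(1 - 1/e - \varepsilon)\, f_j(\mathbf{x}, \mathbf{y}) \le F_j(\mathbf{s}, \mathbf{y}) \le f_j(\mathbf{x}, \mathbf{y}).$$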

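ALGORITHM 1: Construct a weak $(1 + \varepsilon)$-net over the image of $M$

Input: The matrix $M$ of rank $r$, and the error factor $\varepsilon$
Output: The multiplicative $(1 + \varepsilon)$-net $S_\varepsilon$.

1  Let $\lambda$ be the maximum number of bits in any entry of $M$, and
       $S_\lambda = \{0,\ 2^{-\lambda},\ 2^{-\lambda}(1 + \varepsilon),\ 2^{-\lambda}(1 + \varepsilon)^2,\ \ldots,\ n\}$
2  $S_\varepsilon \leftarrow \emptyset$
3  Let $M'$ be the $r \times m$ submatrix of $M$ with $r$ independent row vectors $\mathbf{v}_1, \ldots, \mathbf{v}_r \in [0,1]^m$
4  for $\{i_1, \ldots, i_r\} \in \binom{[m]}{r}$ such that the corresponding columns of $M'$ are independent do
       /* Enumerate every r coordinates according to the grid */
5      for $k_1, \ldots, k_r \in S_\lambda$ do
6          Construct linear system of $r$ equations:
               $(t_1 \mathbf{v}_1 + \cdots + t_r \mathbf{v}_r)_{i_1} = k_1$
               $(t_1 \mathbf{v}_1 + \cdots + t_r \mathbf{v}_r)_{i_2} = k_2$
               ...
               $(t_1 \mathbf{v}_1 + \cdots + t_r \mathbf{v}_r)_{i_r} = k_r$
           /* $t_1 \mathbf{v}_1 + \cdots + t_r \mathbf{v}_r$ is an m-dimensional vector, and $(t_1 \mathbf{v}_1 + \cdots + t_r \mathbf{v}_r)_i$
              denotes its i-th coordinate for each $i \in [m]$ */
7          Use Gaussian elimination to derive the solution $(\hat{t}_1, \ldots, \hat{t}_r) \in \mathbb{R}^r$
8          $\mathbf{s} \leftarrow \hat{t}_1 \mathbf{v}_1 + \cdots + \hat{t}_r \mathbf{v}_r$, $S_\varepsilon \leftarrow S_\varepsilon \cup \{\mathbf{s}\}$
9      end
10 end
11 return $S_\varepsilon$

Proof. By Lemma 5.3, for any $\mathbf{x} \in \{0,1\}^n$, we know that there exists $\mathbf{s} \in S_\varepsilon$ satisfying
Inequality (4). Moreover, for any $\mathbf{y}$, we claim that

$$F_j(\mathbf{s}, \mathbf{y}) \ge F_j(\mathbf{x}, \mathbf{y}) / (1 + \varepsilon) \ge (1 - 1/e - \varepsilon)\, f_j(\mathbf{x}, \mathbf{y}), \tag{8}$$

where the second inequality is due to Lemma 5.2 and the fact that $(1 - 1/e)/(1 + \varepsilon) \ge (1 - 1/e - \varepsilon)$
for all $\varepsilon > 0$. Recall that $F_j(\mathbf{x}, \mathbf{y}) = y_j (1 - e^{-(\mathbf{x}^\top M)_j})$; thus to show the first inequality of Eq. (8),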


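we only need to show that

$$y_j \left(1 - e^{-s_j}\right) \ge y_j \left(1 - e^{-(\mathbf{x}^\top M)_j}\right) / (1 + \varepsilon).$$

For $y_j = 0$, it is trivial. For $y_j = 1$, since $(1 + \varepsilon) s_j \ge (\mathbf{x}^\top M)_j$, we have $e^{-(1 + \varepsilon) s_j} \le e^{-(\mathbf{x}^\top M)_j}$.
Therefore, it is enough to show that $(1 - e^{-s_j}) \ge (1 - e^{-(1 + \varepsilon) s_j}) / (1 + \varepsilon)$, namely,

$$\frac{\varepsilon + e^{-(1 + \varepsilon) s_j}}{1 + \varepsilon} \ge e^{-s_j}.$$

Note that the above inequality holds due to the weighted AM-GM inequality
$\frac{a x + b y}{a + b} \ge x^{\frac{a}{a+b}} y^{\frac{b}{a+b}}$, $\forall x, y, a, b \in \mathbb{R}_+$, by letting $a = \varepsilon$, $x = 1$, $b = 1$, $y = e^{-(1 + \varepsilon) s_j}$.

Furthermore, we claim that

$$f_j(\mathbf{x}, \mathbf{y}) \ge F_j(\mathbf{x}, \mathbf{y}) \ge F_j(\mathbf{s}, \mathbf{y}),$$

where the first inequality is due to Lemma 5.2, and the second one follows from Equation (4).
□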
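
The scalar inequality at the heart of this proof, $(1 - e^{-s}) \ge (1 - e^{-(1+\varepsilon)s})/(1+\varepsilon)$, is easy to sanity-check numerically. The sketch below scans a grid of values for $s$ and $\varepsilon$; the function name `gap` and the scanned ranges are arbitrary choices made for illustration.

```python
from math import exp

def gap(s, eps):
    """(1 - e^{-s}) - (1 - e^{-(1+eps) s}) / (1 + eps); the proof needs this to be >= 0."""
    return (1.0 - exp(-s)) - (1.0 - exp(-(1.0 + eps) * s)) / (1.0 + eps)

# Scan s in [0, 10] and eps in (0, 2]; the smallest gap should be (numerically) non-negative.
worst = min(gap(i / 100.0, e / 10.0) for i in range(0, 1001) for e in range(1, 21))
```

The gap vanishes only at $s = 0$, which is consistent with the AM-GM argument: equality requires $x = y$, i.e. $e^{-(1+\varepsilon)s} = 1$.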

Analogously to $F_j(\mathbf{s}, \mathbf{y})$, we can also define $\hat{\sigma}(\mathbf{s}, \mathbf{y})$ to be the expected number of (eventually)
activated consumer nodes given that the set of initially activated consumer nodes is
distributed according to $\mathbf{F}(\mathbf{s}, \mathbf{y})$. (I.e. each node $j \in V$ is independently initially activated with
probability $F_j(\mathbf{s}, \mathbf{y})$.) For any fixed $\mathbf{s}$, by the definition of $F_j(\mathbf{s}, \mathbf{y})$ in Eq. (7), we know that $\hat{\sigma}(\mathbf{s}, \mathbf{y})$
is equivalent to the influence spread obtained by selecting seed set $Y$ (indicated by $\mathbf{y}$), each seed
$j \in Y$ being activated with probability $1 - e^{-s_j}$, and then influence propagated in the social
graph $G$. Then, by the result in [Kempe et al. 2003], it is straightforward to see that $\hat{\sigma}(\mathbf{s}, \mathbf{y})$ is
monotone and submodular on $\mathbf{y}$. ®

Furthermore, our $(1 + \varepsilon)$-net continues to (approximately) capture all possible inputs to $\sigma$:

Lemma 5.5. Let $\varepsilon > 0$, and let $S_\varepsilon$ be a $(1 + \varepsilon)$-net for $M$. Then for every $\mathbf{x} \in \{0,1\}^n$, there
exists an $\mathbf{s} \in S_\varepsilon$ such that for every $\mathbf{y} \in \{0,1\}^m$,

$$(1 - 1/e - \varepsilon)\, \sigma(\mathbf{x}, \mathbf{y}) \le \hat{\sigma}(\mathbf{s}, \mathbf{y}) \le \sigma(\mathbf{x}, \mathbf{y}).$$


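Proof. The right inequality follows immediately from Lemma 5.4.

To prove the left inequality, we unfortunately need to define yet another function. For
$\mathbf{z} \in \{0,1\}^m$, let $\rho(\mathbf{z})$ be the expected number of activated nodes given that the set of initially
activated nodes is $Z \subseteq V$ indicated by $\mathbf{z}$. Notice that $\rho(\mathbf{z})$ is also submodular [Kempe et al.
2003].

For a fractional $\mathbf{z} \in [0,1]^m$, we extend $\rho(\mathbf{z})$ to be the expectation over integral $\hat{\mathbf{z}} \in
\{0,1\}^m$, where each coordinate is sampled independently with expectation $z_j$. (I.e. $\rho(\mathbf{z}) =
\sum_{\hat{\mathbf{z}} \in \{0,1\}^m} \rho(\hat{\mathbf{z}}) \cdot \prod_{j : \hat{z}_j = 1} z_j \cdot \prod_{j : \hat{z}_j = 0} (1 - z_j)$.) Thus, $\sigma(\mathbf{x}, \mathbf{y}) = \rho(\mathbf{f}(\mathbf{x}, \mathbf{y}))$ and $\hat{\sigma}(\mathbf{s}, \mathbf{y}) = \rho(\mathbf{F}(\mathbf{s}, \mathbf{y}))$.

Finally, since $\rho(\mathbf{z})$ is the multilinear extension of a submodular function, we have (e.g. by
Lemma 2.2 of [Vondrak 2007]) that

$$(1 - 1/e - \varepsilon)\, \rho(\mathbf{f}(\mathbf{x}, \mathbf{y})) \le \rho\big((1 - 1/e - \varepsilon) \cdot \mathbf{f}(\mathbf{x}, \mathbf{y})\big) \le \rho(\mathbf{F}(\mathbf{s}, \mathbf{y})),$$

where the second inequality uses Lemma 5.4 together with the monotonicity of $\rho$.

□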
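
The fractional extension of $\rho$ used in this proof can be evaluated exactly on tiny instances by enumerating all integral outcomes. The sketch below does so for a hypothetical coverage-style spread function; `multilinear`, `REACH` and `rho` are illustrative names, and the coverage function merely stands in for an influence spread (it is monotone and submodular, but it is not the IC model).

```python
from itertools import product

def multilinear(rho, z):
    """E[rho(Z)] where each index j joins Z independently with probability z[j]."""
    total = 0.0
    for bits in product([0, 1], repeat=len(z)):
        weight = 1.0
        for zj, b in zip(z, bits):
            weight *= zj if b else 1.0 - zj
        total += weight * rho(frozenset(j for j, b in enumerate(bits) if b))
    return total

# Hypothetical spread: seed j "reaches" the node set REACH[j]; rho counts reached nodes.
REACH = [{0, 1}, {1, 2}, {2, 3, 4}]

def rho(Z):
    return len(set().union(*(REACH[j] for j in Z)))

z = [0.3, 0.8, 0.5]
c = 0.58                                   # any scaling factor in [0, 1]
scaled = multilinear(rho, [c * t for t in z])
unscaled = multilinear(rho, z)
```

On this instance `scaled >= c * unscaled`, which is the scaling property invoked in the first inequality of the displayed chain.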


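5.4. From $(1 + \varepsilon)$-net to approximation algorithm

Armed with our $(1 + \varepsilon)$-net, we can use standard submodular maximization techniques to
approximately solve AIM-r. The full algorithm is summarized in Algorithm 2, and is referred to as
the Sampled Double Greedy (SDG) algorithm.

ALGORITHM 2: SDG: A constant-factor approximation to AIM-r

Input: Bi-adjacency matrix $M$, adjacency matrix $P$, budgets $b_1$, $b_2$, accuracy parameter $\varepsilon > 0$
Output: Subsets $(X, Y)$ that approximate the optimal $\sigma(X, Y)$

1 Construct $(1 + \varepsilon)$-net $S_\varepsilon$ from $M$ by Algorithm 1
2 for $\mathbf{s} \in S_\varepsilon$ do
3     Use greedy algorithm to find solution $\mathbf{y}_{\mathbf{s}}$ on submodular function $\hat{\sigma}(\mathbf{s}, \cdot)$ with budget $b_2$
4     Use greedy algorithm to find solution $\mathbf{x}_{\mathbf{y}_{\mathbf{s}}}$ on submodular function $\sigma(\cdot, \mathbf{y}_{\mathbf{s}})$ with budget $b_1$
5 end
6 return $\arg\max_{(\mathbf{x}_{\mathbf{y}_{\mathbf{s}}}, \mathbf{y}_{\mathbf{s}})} \sigma(\mathbf{x}_{\mathbf{y}_{\mathbf{s}}}, \mathbf{y}_{\mathbf{s}})$

For every $\mathbf{s} \in S_\varepsilon$, let $\mathbf{y}_{\mathbf{s}}$ be a $(1 - 1/e - \varepsilon)$-approximation to the feasible vector $\mathbf{y}^*$ that
maximizes $\hat{\sigma}(\mathbf{s}, \mathbf{y})$. Recall that such an approximation can be found in polynomial time (with
high probability) via standard submodular maximization techniques (see e.g. [Kempe et al.
2003] as well as Subsection 5.5). Let $(\mathbf{x}^*, \mathbf{y}^*)$ be the optimal feasible solution to the AIM-r
problem. Then for some $\mathbf{s}^*$, we have that for every $\mathbf{y} \in \{0,1\}^m$:

$$(1 - 1/e - \varepsilon)\, \sigma(\mathbf{x}^*, \mathbf{y}) \le \hat{\sigma}(\mathbf{s}^*, \mathbf{y}) \le \sigma(\mathbf{x}^*, \mathbf{y});$$

and in particular,

$$\hat{\sigma}(\mathbf{s}^*, \mathbf{y}_{\mathbf{s}^*}) \ge (1 - 1/e - \varepsilon)\, \hat{\sigma}(\mathbf{s}^*, \mathbf{y}^*) \ge (1 - 1/e - \varepsilon)^2 \sigma(\mathbf{x}^*, \mathbf{y}^*). \tag{9}$$

For each fixed $\mathbf{y}_{\mathbf{s}}$, the function $\sigma(\mathbf{x}, \mathbf{y}_{\mathbf{s}})$ is monotone and submodular on $\mathbf{x}$. This is because we
can remove all edges from $U$ to nodes not in the set indicated by $\mathbf{y}_{\mathbf{s}}$, and $\sigma(\mathbf{x}, \mathbf{y}_{\mathbf{s}})$ is the influence
spread of seed set $X$ in the combined bipartite graph and social graph after removing
those edges. Since diffusion from $X$ in this subgraph can be viewed as IC model diffusion, by
[Kempe et al. 2003] we know that $\sigma(\mathbf{x}, \mathbf{y}_{\mathbf{s}})$ is monotone and submodular on $\mathbf{x}$. Then we can
take $\mathbf{x}_{\mathbf{s}}$ to be a $(1 - 1/e - \varepsilon)$-approximation to the feasible vector $\mathbf{x}^*$ that maximizes $\sigma(\mathbf{x}, \mathbf{y}_{\mathbf{s}})$.
Thus, for $\mathbf{y}_{\mathbf{s}^*}$, we have

$$\sigma(\mathbf{x}_{\mathbf{s}^*}, \mathbf{y}_{\mathbf{s}^*}) \ge (1 - 1/e - \varepsilon)\, \sigma(\mathbf{x}^*, \mathbf{y}_{\mathbf{s}^*}) \ge (1 - 1/e - \varepsilon)\, \hat{\sigma}(\mathbf{s}^*, \mathbf{y}_{\mathbf{s}^*}). \tag{10}$$

Taking (9) and (10) together, we have

$$\sigma(\mathbf{x}_{\mathbf{s}^*}, \mathbf{y}_{\mathbf{s}^*}) \ge (1 - 1/e - \varepsilon)^3 \sigma(\mathbf{x}^*, \mathbf{y}^*).$$

This completes the proof of Theorem 5.1. □

®For fixed $\mathbf{s}$, when we say that a vector function $\hat{\sigma}(\mathbf{s}, \mathbf{y})$ is monotone and submodular on the indicator vector $\mathbf{y}$, we
mean that $\hat{\sigma}(\mathbf{s}, Y) = \hat{\sigma}(\mathbf{s}, \mathbf{y})$ is monotone and submodular on the set $Y$ indicated by $\mathbf{y}$.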
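
The two greedy steps can be sketched end to end in a deliberately simplified setting. The code below assumes a trivial social graph (no edges), so that $\sigma(X, Y) = \sum_{j \in Y} f_j$ and $\hat{\sigma}(\mathbf{s}, Y) = \sum_{j \in Y} (1 - e^{-s_j})$ can be evaluated exactly, and it takes the net as an input instead of calling Algorithm 1. All function names are illustrative assumptions; in the usage example an exact net obtained by brute force stands in for $S_\varepsilon$.

```python
from itertools import combinations
from math import exp, prod

def sigma(M, X, Y):
    """Exact spread when G has no edges: sum over j in Y of f_j."""
    return sum(1.0 - prod(1.0 - M[i][j] for i in X) for j in Y)

def sigma_hat(s, Y):
    """Relaxed spread for a net point s, again with no social edges."""
    return sum(1.0 - exp(-s[j]) for j in Y)

def greedy(value, ground, budget):
    """Standard greedy for a monotone set function under a cardinality budget."""
    chosen = set()
    for _ in range(budget):
        best = max((e for e in ground if e not in chosen),
                   key=lambda e: value(chosen | {e}), default=None)
        if best is None:
            break
        chosen.add(best)
    return chosen

def sdg(M, net, b1, b2):
    """Lines 2-6 of Algorithm 2: two greedy steps per net point, keep the best pair."""
    n, m = len(M), len(M[0])
    best = (-1.0, set(), set())
    for s in net:
        Y = greedy(lambda T: sigma_hat(s, T), range(m), b2)
        X = greedy(lambda T: sigma(M, T, Y), range(n), b1)
        best = max(best, (sigma(M, X, Y), X, Y), key=lambda t: t[0])
    return best

M = [[0.9, 0.1, 0.0, 0.2],
     [0.0, 0.8, 0.7, 0.0],
     [0.3, 0.3, 0.3, 0.3]]
# Stand-in for Algorithm 1: the exact vectors x^T M for all X of size b1 = 2.
exact_net = [[sum(M[i][j] for i in X) for j in range(4)]
             for X in combinations(range(3), 2)]
value, X_best, Y_best = sdg(M, exact_net, b1=2, b2=2)
```

Because the net here is exact and both greedy steps are run on exactly evaluated functions, the returned value is within the $(1 - 1/e)^3$ factor of the brute-force optimum on this instance; with a genuine $(1 + \varepsilon)$-net and sampled spread estimates the additional $\varepsilon$ loss of Theorem 5.1 applies.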

5.5. Running time 

We have completely ignored the question of how to compute $\sigma(\mathbf{x}, \mathbf{y})$ and its relaxation $\hat{\sigma}(\mathbf{s}, \mathbf{y})$.
In particular, this computation is necessary for the submodular maximization procedures used
in Algorithm 2. Although their exact computations are #P-hard [Chen et al. 2010], both can
be efficiently approximated with arbitrarily good precision by sampling from the corresponding
random processes. In particular, for the independent cascade model, both greedy steps in
Algorithm 2 can apply the near-linear time algorithms in [Borgs et al. 2014; Tang et al. 2014].
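
The simplest form of such sampling is plain Monte Carlo simulation of the cascade. The sketch below is a naive estimator, not the near-linear time algorithms cited above; the adjacency format, the function name `ic_spread`, and the default number of rounds are assumptions made for illustration.

```python
import random

def ic_spread(adj, seeds, rounds=2000, rng=None):
    """Estimate the expected number of activated nodes under the IC model.

    adj[u] is a list of (v, p) pairs: once u becomes active it gets a single
    chance to activate each inactive out-neighbour v, succeeding with probability p.
    """
    rng = rng or random.Random(0)       # fixed seed so runs are reproducible
    total = 0
    for _ in range(rounds):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            u = frontier.pop()
            for v, p in adj.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / rounds

# Deterministic path 0 -> 1 -> 2, plus an edge to node 3 that never fires.
path = {0: [(1, 1.0)], 1: [(2, 1.0)], 2: [(3, 0.0)]}
estimate = ic_spread(path, [0], rounds=50)
```

To estimate $\sigma(\mathbf{x}, \mathbf{y})$ or $\hat{\sigma}(\mathbf{s}, \mathbf{y})$ with this routine, one would first sample the initially activated consumers with probabilities $f_j$ or $F_j$ and then pass them as `seeds`; the number of rounds needed for a given accuracy follows from standard concentration bounds.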

The total running time of SDG is $O\big(m^r (\lambda + \log n)^r \varepsilon^{-(r+2)} (b_1 + b_2)(n + m + \ell) \log(1/\delta)\big)$, where
$O(m^r (\lambda + \log n)^r \varepsilon^{-r})$ is the size of the $(1 + \varepsilon)$-net $S_\varepsilon$, and $O(\varepsilon^{-2} (b_1 + b_2)(n + m + \ell) \log(1/\delta))$ is
the time for running the two greedy steps using the algorithm in [Tang et al. 2014], with $\ell$ being the
total number of edges in $B$ and $G$. In some situations $r$ could be very small, e.g. $r = 1$ when
the influence probability from provider $i$ to consumer $j$ can be approximated as the product
of provider $i$'s influence strength and consumer $j$'s susceptibility. Moreover, in practice seed
consumers may be selected from a candidate set of size $m'$ (e.g. the fan base of a product) much
smaller than the social network size ($m' \ll m$); then the dominant term $m^r$ would be replaced
by the much smaller $(m')^r$. Therefore, SDG would be efficient in these practical situations.

6. GENERAL DIFFUSION MODEL 

Our results can be generalized to support any diffusion model in the social network $G$ that
has a monotone and submodular influence spread function. More specifically, for any subset
$Z \subseteq V$ in the social graph $G$, let $\rho(Z)$ be the influence spread of $Z$ in $G$, that is, the expected
number of activated nodes in $G$ after the diffusion process when $Z$ is selected as the initial
seed set. The diffusion model in the combined bipartite graph $B$ and social graph $G$ with
selected seed providers $X$ and seed consumers $Y$ is as follows: First, all seed providers in $X$ are
activated; then, following the probabilities given in the bi-adjacency matrix $M$, a subset of
seed consumers, $Z \subseteq Y$, is activated (each $j \in Y$ is activated independently with probability
$f_j(\mathbf{x}, \mathbf{y})$ as before). Then the diffusion from $Z$ follows the social graph diffusion model, with
expected spread $\rho(Z)$. We still use notation $\sigma(X, Y)$ to represent the influence spread of $X$ and
$Y$ in the combined network. Then we have $\sigma(X, Y) = \sum_{Z \subseteq Y} \Pr_X[Z]\, \rho(Z)$, where $\Pr_X[Z]$ is the
probability that $Z$ is the initially activated set in $Y$ by provider seed set $X$ according to the
matrix $M$.

One particular instantiation of interest is that of background propagation: each consumer
has some initial (“background”) probability of being influenced by content providers even without
being seeded by the advertiser; if the advertiser seeds the consumer, she has a (higher)
boosted probability of being influenced by the content providers. To implement this model as
a submodular influence function, we can sample the result of the diffusion process from the
background probabilities and incorporate the expected outcome into the definition of $\rho(\cdot)$.

In general, we can show that as long as $\rho(\cdot)$ is monotone, submodular, and polynomial-time
computable, and matrix $M$ is of constant rank as assumed before, the AIM problem can still be
approximated to within the same factor in polynomial time. The main revision of the proof is to show
that in the general model $\sigma(X, Y)$
is still monotone and submodular when we fix either $X$ or $Y$. See Appendix A for details.
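
For very small instances, the definition $\sigma(X, Y) = \sum_{Z \subseteq Y} \Pr_X[Z]\, \rho(Z)$ can be evaluated by explicit enumeration, which is a convenient way to test an implementation of a spread oracle. In the sketch below, `sigma_general` and the callable `rho` are hypothetical names; the enumeration is exponential in $|Y|$ and is meant only as an illustration of the formula.

```python
from itertools import combinations
from math import prod

def sigma_general(M, X, Y, rho):
    """sigma(X, Y) = sum over Z subset of Y of Pr_X[Z] * rho(Z), by enumeration."""
    Y = sorted(Y)
    # f[j]: probability that seed consumer j is initially activated by providers in X.
    f = {j: 1.0 - prod(1.0 - M[i][j] for i in X) for j in Y}
    total = 0.0
    for size in range(len(Y) + 1):
        for Z in combinations(Y, size):
            inside = set(Z)
            pr = prod(f[j] if j in inside else 1.0 - f[j] for j in Y)
            total += pr * rho(frozenset(Z))
    return total

M = [[0.5, 0.2],
     [0.5, 0.0]]
# With rho(Z) = |Z| (no social propagation) the spread is just the sum of the f_j.
spread = sigma_general(M, {0, 1}, {0, 1}, len)
```

Any monotone submodular `rho` can be plugged in; with `rho = len` the result reduces to $\sum_{j \in Y} f_j$, which gives a quick consistency check.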

7. CONCLUSION 

In this paper we propose the amphibious influence maximization (AIM) model as a proxy
framework that combines traditional marketing via content providers together with viral
marketing to consumers in social networks. We show that the associated computational problem
is NP-hard to approximate to within any constant factor, and provide a polynomial-time algorithm
with $(1 - 1/e - \varepsilon)^3$ approximation ratio for any (polynomially small) $\varepsilon > 0$ when we restrict the
weighted bi-adjacency matrix $M$ for the provider-consumer network to be of constant rank.

It would be interesting to see to what extent amphibious marketing (i.e. targeting individual 
users via a combination of traditional content providers and social network viral marketing) 
can be implemented in practice. Beyond the algorithmic challenge of optimizing the sets of 
seed providers and consumers we discuss in this paper, this notion raises many interesting 
challenges in terms of learning the influence factors (the adjacency matrices in our model), 
privacy of the consumers, economic incentives, etc. 

From the perspective of theoretical computer science, we view our algorithm for AIM with
low rank assumption as part of the ongoing effort in the community to incorporate assumptions
that are both reasonable in practice, and allow better algorithmic results. In this context
we remark that although our low rank assumption is most natural in the context of content
providers-consumers influence matrix, it is also closely related to another important property
that has been observed in graphs of social networks: the eigenvalues exhibit a power law
[Faloutsos et al. 1999; Mihail and Papadimitriou 2002].

APPENDIX

A. GENERALIZED MODEL 

In this appendix, we extend the underlying diffusion model in social graph $G$ to allow a general
monotone and submodular influence spread function $\rho(\cdot)$. We show that as long as the general
influence spread function $\rho(\cdot)$ can be approximated in polynomial time and matrix $M$ is of
constant rank as assumed before, the same SDG algorithm (except that we now need a computation
oracle for $\rho(\cdot)$, see Algorithm 3) solves the generalized AIM problem with the same
approximation ratio in polynomial time. We also discuss a particular consequence of this
generalization that allows each consumer node to have a background activation probability even
if it is not selected as a seed.

A.1. Definition of the generalized model

Instead of assuming the particular IC model, we assume that the influence spread function
over the social graph is a general monotone and submodular function. Formally, we use $G =
(V, \rho)$ to denote the social graph where $\rho$ is the general monotone and submodular function
computing the resulting influence spread of seed consumers. Namely, given any subset $Z \subseteq V$,
$\rho(Z)$ is the resulting influence spread through $G$ when $Z$ is the set of consumers initially
influenced by the content providers. In the following, we assume that $\rho(Z)$ is monotone and
submodular.

We still use notation $\sigma(X, Y)$ to represent the influence spread of $X$ and $Y$ in the combined
network. Then we have $\sigma(X, Y) = \sum_{Z \subseteq Y} \Pr_X[Z]\, \rho(Z)$, where $\Pr_X[Z]$ is the probability that $Z$
is the initially activated set in $Y$ by provider seed set $X$ according to the matrix $M$.

Our goal is still the same as the original AIM: to find a set $X$ of seed providers of size $b_1$
and a set $Y$ of seed consumers of size $b_2$ such that they work together to generate the largest
influence spread, namely maximizing $\sigma(X, Y)$.

A.2. Result and Proof for the Extension 

We still restrict the bi-adjacency matrix $M$ to be of constant rank $r$. Moreover, we assume
that there is a value oracle $O$ computing the general influence spread function $\rho(Z)$ for any
set $Z \subseteq V$ with running time $t_O$. The full algorithm is summarized in Algorithm 3, and the
only adaptation is to use the value oracle $O$. In particular, when we use the greedy algorithm on
the functions $\hat{\sigma}(\mathbf{s}, \cdot)$ and $\sigma(\cdot, \mathbf{y}_{\mathbf{s}})$, we can use Monte Carlo simulation to obtain the initially activated
set $Z$ in $G$ and then obtain $\rho(Z)$ from the oracle, which would give us a $1 \pm \varepsilon$ approximation
of $\hat{\sigma}(\mathbf{s}, \mathbf{y})$ and $\sigma(\mathbf{x}, \mathbf{y}_{\mathbf{s}})$ with high probability, for any fixed $\mathbf{s} \in [0, \infty)^m$, $\mathbf{x} \in \{0,1\}^n$ and $\mathbf{y}, \mathbf{y}_{\mathbf{s}} \in \{0,1\}^m$. The
theoretical guarantee is stated as follows:

Theorem A.1. Assuming that there exists a value oracle $O$ computing the function $\rho(\mathbf{y})$ for
any indicator vector $\mathbf{y}$ with running time $t_O$, for any $\delta, \varepsilon > 0$, with probability $1 - \delta$, Algorithm 3
solves the generalized AIM problem with constant rank-$r$ matrix $M$ with approximation ratio
$(1 - 1/e - \varepsilon)^3$ and in time polynomial in $n$, $m$, $\lambda$, $1/\varepsilon$, $\log(1/\delta)$, $t_O$.

ALGORITHM 3: SDG: A constant-factor approximation for generalized AIM-r

Input: Bi-adjacency matrix $M$, value oracle $O$ that calculates $\rho(\mathbf{y})$, budgets $b_1$, $b_2$, and parameter $\varepsilon$
Output: Subsets $(X, Y)$ that make $\sigma(X, Y)$ near-optimal.

1 Construct $(1 + \varepsilon)$-net $S_\varepsilon$ from $M$ by Algorithm 1
2 for $\mathbf{s} \in S_\varepsilon$ do
3     Use greedy algorithm with oracle $O$ to find $\mathbf{y}_{\mathbf{s}}$ on submodular function $\hat{\sigma}(\mathbf{s}, \cdot)$ with budget $b_2$
4     Use greedy algorithm with oracle $O$ to find $\mathbf{x}_{\mathbf{y}_{\mathbf{s}}}$ on submodular function $\sigma(\cdot, \mathbf{y}_{\mathbf{s}})$ with budget $b_1$
5 end
6 return $\arg\max_{(\mathbf{x}_{\mathbf{y}_{\mathbf{s}}}, \mathbf{y}_{\mathbf{s}})} \sigma(\mathbf{x}_{\mathbf{y}_{\mathbf{s}}}, \mathbf{y}_{\mathbf{s}})$

Notice that the proof of the above theorem is essentially the same as that of Theorem 5.1, except that
we have to prove that, for the general model, the function $\sigma(\cdot, \mathbf{y})$ and its relaxation $\hat{\sigma}(\mathbf{s}, \cdot)$ are still
monotone and submodular. Thus it is enough for us to prove the two submodularities which
are stated in the following lemma.

Lemma A.2. We have the following two properties:

(1) For any fixed $\mathbf{y} \in \{0,1\}^m$, $\sigma(\mathbf{x}, \mathbf{y})$ is a monotone and submodular function on $\mathbf{x}$.

(2) For any fixed $\mathbf{s} \in [0, \infty)^m$, $\hat{\sigma}(\mathbf{s}, \mathbf{y})$ is a monotone and submodular function on $\mathbf{y}$.

Proof.

Property (1). For any fixed set $Y$, denote the random subset of $Y$ containing nodes activated
by $X$ as $\mathcal{R}_Y(X)$, where the randomness comes from probabilistic edges in the bipartite graph
$B$ which will be sampled according to the bi-adjacency matrix $M$. Then $\sigma(\mathbf{x}, \mathbf{y}) = \sigma(X, Y) =
\mathbb{E}[\rho(\mathcal{R}_Y(X))]$. Since $\mathbb{E}[\rho(\mathcal{R}_Y(\cdot))]$ is a composition of monotone functions ($\mathbb{E}[\cdot]$, $\rho(\cdot)$, and $\mathcal{R}_Y(\cdot)$),
it is also monotone. To show that $\mathbb{E}[\rho(\mathcal{R}_Y(\cdot))]$ is also submodular, we prove $\mathbb{E}[\rho(\mathcal{R}_Y(W))] +
\mathbb{E}[\rho(\mathcal{R}_Y(T))] \ge \mathbb{E}[\rho(\mathcal{R}_Y(W \cup T))] + \mathbb{E}[\rho(\mathcal{R}_Y(W \cap T))]$, for any $W, T \subseteq U$.

Consider the fixed $Y$ and fix any realization of the live-edge bipartite graph $B_L$ when all edges
are sampled according to $M$. We use notation $\mathcal{R}_{B_L, Y}(X)$ to denote the set of all nodes in $Y$
reachable from $X$ in graph $B_L$. Next, it is enough to prove for all possible live-edge graphs $B_L$
and sets $Y \subseteq V$,

$$\rho(\mathcal{R}_{B_L, Y}(W)) + \rho(\mathcal{R}_{B_L, Y}(T)) \ge \rho(\mathcal{R}_{B_L, Y}(W \cup T)) + \rho(\mathcal{R}_{B_L, Y}(W \cap T)). \tag{11}$$

For simplicity, we omit the subscript of the notation, and let $\mathcal{R} = \mathcal{R}_{B_L, Y}$.
Since $\rho(\cdot)$ is also a submodular function, we have

$$\rho(\mathcal{R}(W)) + \rho(\mathcal{R}(T)) \ge \rho(\mathcal{R}(W) \cup \mathcal{R}(T)) + \rho(\mathcal{R}(W) \cap \mathcal{R}(T)). \tag{12}$$

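Observe that

$$\mathcal{R}(W \cup T) = \bigcup_{u \in W \cup T} \mathcal{R}(u) = \Big( \bigcup_{u \in W} \mathcal{R}(u) \Big) \cup \Big( \bigcup_{u \in T} \mathcal{R}(u) \Big) = \mathcal{R}(W) \cup \mathcal{R}(T);$$

$\mathcal{R}(W \cap T) \subseteq \mathcal{R}(W), \mathcal{R}(T)$, and hence $\mathcal{R}(W \cap T) \subseteq \mathcal{R}(W) \cap \mathcal{R}(T)$.

Since $\rho(\cdot)$ is a monotone function, we know

$$\rho(\mathcal{R}(W) \cup \mathcal{R}(T)) \ge \rho(\mathcal{R}(W \cup T)), \quad \text{and} \quad \rho(\mathcal{R}(W) \cap \mathcal{R}(T)) \ge \rho(\mathcal{R}(W \cap T)).$$

Together with (12), Inequality (11) can be derived, which completes the proof of Property (1).

Property (2). Note that $\hat{\sigma}(\mathbf{s}, \mathbf{y}) = \sum_{Z \subseteq V} \big[ \rho(Z \cap Y) \cdot \Pr_{\mathbf{s}, \mathbf{y}}[Z] \big]$, where $\Pr_{\mathbf{s}, \mathbf{y}}[Z]$ denotes the
probability that $Z$ is sampled out from $V$ according to $\mathbf{F}(\mathbf{s}, \mathbf{y})$. Since for every fixed set $Z \subseteq V$,
$\rho(Z \cap Y)$ is a monotone and submodular function on $Y$, $\hat{\sigma}(\mathbf{s}, \mathbf{y})$ is a weighted average over such
functions. Therefore $\hat{\sigma}(\mathbf{s}, \mathbf{y})$ is also a monotone and submodular function. □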
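
Inequality (11) from the proof of Property (1) can be verified by brute force on a small live-edge graph. The sketch below is an illustration only: `reach` plays the role of $\mathcal{R}_{B_L, Y}$, the live-edge graph is given as a dictionary from providers to consumers, and the coverage-style `rho` is a hypothetical stand-in for a monotone submodular spread function.

```python
from itertools import chain, combinations

def subsets(items):
    items = list(items)
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))

def reach(live_edges, Y, X):
    """R_{B_L,Y}(X): nodes of Y adjacent to some node of X in the live-edge graph."""
    return frozenset(j for i in X for j in live_edges.get(i, ()) if j in Y)

def inequality_11_holds(live_edges, U, Y, rho):
    """Check rho(R(W)) + rho(R(T)) >= rho(R(W | T)) + rho(R(W & T)) for all W, T in U."""
    for W in map(frozenset, subsets(U)):
        for T in map(frozenset, subsets(U)):
            lhs = rho(reach(live_edges, Y, W)) + rho(reach(live_edges, Y, T))
            rhs = rho(reach(live_edges, Y, W | T)) + rho(reach(live_edges, Y, W & T))
            if lhs < rhs - 1e-12:
                return False
    return True

REACH = {0: {0, 1}, 1: {1, 2}, 2: {2, 3}}          # hypothetical social reach sets

def rho(Z):
    return len(set().union(*(REACH[j] for j in Z)))

live = {0: [0, 1], 1: [1], 2: [1, 2]}               # one realization of B_L
holds = inequality_11_holds(live, [0, 1, 2], {0, 1, 2}, rho)
```

Replacing `rho` by a function that is not submodular, such as $|Z|^2$, makes the check fail on the same graph, which shows that the submodularity of $\rho$ is genuinely used in step (12).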


A.3. Supporting background probabilities for consumer nodes 

The above extended model also allows us to consider the following extension: we assume that
every node $v$ in the social graph $G$ has a background (activation) probability $b_v$, that is, the
probability that $v$ can be activated as one of the influence spread sources is $b_v$, independent of
whether $v$ is selected as a consumer seed. As a result, the set $Z$ of initially activated consumer
nodes in $G$ comes from two sources: a node $v$ is in $Z$ either because $v$ is selected as a consumer
seed in $Y$ and $v$ is activated by some provider seed in $X$ through the bipartite graph $B$ with
bi-adjacency matrix $M$, or $v$ is activated independently by a background probability $b_v$. Then
the final influence spread is $\rho(Z)$ once $Z$ is determined.

This extension covers the realistic cases where a consumer may pay attention to the
advertiser’s campaign anyway (either from content providers or any other unspecified sources)
whether or not she is selected as a seed, but if she is selected as a seed, she will pay more
attention to the selected content providers and her probability of propagating the campaign is
boosted.

For convenience, denote the vector of all background probabilities as $\mathbf{b}$. We use notation
$\sigma'(X, Y)$ to represent the influence spread of $X$ and $Y$ in the combined network with background
probabilities. Then we have $\sigma'(X, Y) = \sum_{Z_0 \subseteq V} \sum_{Z \subseteq Y} \Pr_{\mathbf{b}}[Z_0] \Pr_X[Z]\, \rho(Z \cup Z_0)$, where
$\Pr_{\mathbf{b}}[Z_0]$ is the probability that $Z_0$ is sampled out from $V$ as the initially activated node set
according to $\mathbf{b}$, and $\Pr_X[Z]$ is the probability that $Z$ is sampled out from $Y$ as the initially
activated node set activated by the provider seed set $X$ according to the matrix $M$.
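
For a fixed set $Z$, the inner expectation over the background set $Z_0$ in this formula can be computed exactly on small graphs by enumerating all outcomes. The sketch below does this; `background_spread` and the list `b` of per-node background probabilities are illustrative names, and in practice this sum would be estimated by Monte Carlo sampling instead of enumeration.

```python
from itertools import product

def background_spread(rho, b, Z):
    """Sum over Z0 of Pr_b[Z0] * rho(Z | Z0), enumerating every background outcome."""
    total = 0.0
    for bits in product([0, 1], repeat=len(b)):
        pr = 1.0
        for bv, bit in zip(b, bits):
            pr *= bv if bit else 1.0 - bv
        Z0 = {v for v, bit in enumerate(bits) if bit}
        total += pr * rho(frozenset(Z) | Z0)
    return total

# With rho(Z) = |Z| the result is |Z| plus the background mass outside Z.
value = background_spread(len, [0.1, 0.5, 0.0], {0})
```

In the usage line, node 0 is already in $Z$, so only nodes 1 and 2 contribute their background probabilities.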

We now show that this extension $\sigma'(X, Y)$ can be treated as a special case of the general
model defined in Section A.1. By the definition of $\sigma'(X, Y)$, we have

$$\sigma'(X, Y) = \sum_{Z_0 \subseteq V} \sum_{Z \subseteq Y} \Pr_{\mathbf{b}}[Z_0] \Pr_X[Z]\, \rho(Z \cup Z_0) = \sum_{Z \subseteq Y} \Pr_X[Z] \sum_{Z_0 \subseteq V} \Pr_{\mathbf{b}}[Z_0]\, \rho(Z \cup Z_0).$$

Define $\rho'(Z) = \sum_{Z_0 \subseteq V} \Pr_{\mathbf{b}}[Z_0]\, \rho(Z \cup Z_0)$; then we have $\sigma'(X, Y) = \sum_{Z \subseteq Y} \Pr_X[Z]\, \rho'(Z)$. Hence,
$\sigma'(X, Y)$ can be viewed as the final influence spread in the general model defined in Section A.1
with $\rho'$ as the influence spread in the social network $G$. Since $\rho$ is monotone and submodular,
it is straightforward to check that $\rho(Z \cup Z_0)$ is monotone and submodular in $Z$ for any $Z_0$, and
thus $\rho'(Z)$, as a non-negative linear combination of the $\rho(Z \cup Z_0)$'s, is also monotone and submodular.
Therefore, the extension with background probabilities can indeed be treated as a special
case of the general model, and thus the algorithm and result in Section A.2 can cover this
further extension. The only caveat is that to compute $\rho'(Z)$, we may need to combine Monte
Carlo simulations for the set $Z_0$ with the computation oracle for $\rho(\cdot)$ to get an accurate estimate
for $\rho'(Z)$.

B. HARDNESS OF APPROXIMATION RESULT FOR ACCEPTANCE PROBABILITY MAXIMIZATION

In this appendix, we apply ideas from our hardness for AIM to prove the hardness of
approximation result for the problem of acceptance probability maximization (APM) studied by
[Yang et al. 2013] in the context of active friending. In APM, an initiator $s$ tries to find $k$ nodes
in a social network to send friending requests to in order to maximize the eventual acceptance
probability of a target node $t$, when $s$ finally sends a friending request to $t$. In this model, if $s$
sends a friending request to a non-friend $v$ in the network, then the common friends of $s$ and $v$
would each independently influence $v$ to accept the request from $s$; once $v$ accepts the request,
the influence can further propagate to $v$’s friends who also receive friending requests from $s$.
Technically, the diffusion is formulated as following the independent cascade (IC) model and
the maximization problem is equivalent to finding a subgraph such that the activation
probability of target $t$ is maximized when diffusion only propagates in this subgraph from seed
nodes to $t$, where seed nodes are essentially the original friends of source node $s$. We formally
restate the APM problem below.

Definition B.1 (Acceptance Probability Maximization (APM) [Yang et al. 2013]). Given a
graph $G = (V, E)$ with independent probabilities $p_e$ on the edges, seed set $S \subseteq V$, a target
node $t \in V \setminus S$, and a budget $B$. The output of APM is a subset $W \subseteq V$ of size $|W| \le B$. Let
$G(S \cup W \cup \{t\})$ be the subgraph of $G$ induced by nodes in $S \cup W \cup \{t\}$, and suppose that influence
diffusion in $G(S \cup W \cup \{t\})$ follows the independent cascade model with edge probabilities $p_e$
for every edge $e$ in the subgraph $G(S \cup W \cup \{t\})$. The goal of APM is to maximize the
activation probability of $t$ when influence diffusion is from the seed set $S$ and is restricted to the
subgraph $G(S \cup W \cup \{t\})$.
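
The APM objective can be evaluated exactly on toy graphs by enumerating live-edge realizations of the induced subgraph. The sketch below assumes directed edges stored as a dictionary `{(u, v): p}`; this representation and the function name `acceptance_probability` are assumptions made for illustration, and the enumeration is exponential in the number of edges.

```python
from itertools import product

def acceptance_probability(edges, S, W, t):
    """Exact Pr[t is activated] under IC restricted to the subgraph induced by S, W and t."""
    allowed = set(S) | set(W) | {t}
    sub = [(u, v, p) for (u, v), p in edges.items() if u in allowed and v in allowed]
    total = 0.0
    for live in product([0, 1], repeat=len(sub)):
        pr = 1.0
        for (_, _, p), on in zip(sub, live):
            pr *= p if on else 1.0 - p
        if pr == 0.0:
            continue                      # impossible realization, skip the search
        active, frontier = set(S), list(S)
        while frontier:                   # reachability from S over live edges
            u = frontier.pop()
            for (a, b, _), on in zip(sub, live):
                if on and a == u and b not in active:
                    active.add(b)
                    frontier.append(b)
        total += pr * (t in active)
    return total

edges = {("s", "a"): 0.5, ("a", "t"): 0.5, ("s", "b"): 1.0, ("b", "t"): 0.25}
both = acceptance_probability(edges, {"s"}, {"a", "b"}, "t")
```

Selecting only `a` or only `b` yields probability 0.25 in this example, while selecting both yields $1 - 0.75^2$, which illustrates how the choice of $W$ determines which paths to $t$ are available.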

The APM problem bears similarities to the AIM problem: both are maximizing the effect
of influence diffusion, both need to select a certain number of nodes with respect to the budget
constraint, and the influence diffusion in both problems is restricted in some way by the
selected nodes. However, they differ in two important aspects: first, APM restricts the influence
diffusion within the selected subgraph, while AIM only restricts diffusion from the selected
seed providers to selected seed consumers, but from seed consumers, the diffusion can reach
all other nodes in the social network; second, APM uses one budget for selecting the subgraph,
while AIM uses two separate budgets on seed providers and seed consumers respectively.

The differences in the two problems prevent us from providing a black box reduction
between the two problems, but their similarities allow us to apply the techniques from AIM
hardness to APM hardness. Moreover, by exploiting the fact that APM restricts the diffusion
to the selected subgraph from an arbitrary input graph, we are able to amplify the
constant-factor hardness result of AIM (Theorem 4.1) to get an even stronger inapproximability result
for APM:

Theorem B.2. For any constant $\varepsilon > 0$, APM over general graph $G$ is NP-hard to approximate
to within factor $2^{-n^{1 - \varepsilon}}$, where $n$ is the number of nodes in $G$.

The rest of this appendix is devoted to the proof of Theorem B.2. In the next subsection we
prove that in the special case of a three-layer graph (when disregarding the single source node
$s$ and the single target node $t$), APM is NP-hard to approximate to within any constant factor
(Lemma B.3). This proof is almost identical to the proof of our main hardness result for AIM
(Theorem 4.1). Then, in Subsection B.2 we concatenate instances of three-layer APM to
achieve exponential hardness.

B.1. Constant factor hardness for three-layer APM 

In this subsection we prove that in the special case of a three-layer graph (when excluding the
single seed node $s$ and the single target node $t$), APM is NP-hard to approximate to within any
constant factor. In fact, it will be convenient to prove the following slightly stronger bi-criteria
inapproximability:

Lemma B.3. Let $\alpha > 0$ be any constant. Given a budget $B$ and a three-layer graph
$G = (V, E)$ (s.t. $V = \{s\} \cup U \cup V_1 \cup V_2 \cup \{t\}$), it is NP-hard to distinguish
between the following:

Completeness: the associated APM instance has value at least 1/3; and
Soundness: even with budget $2B$, the associated APM instance has value at most $\alpha$.

Proof. Our proof is very similar to the proof of Theorem 4.1. The main difference is that for
the soundness we need to rule out solutions that perform much better using additional budget.
This additional budget comes from three sources: having budget $2B$ instead of $B$; allowing
additional budget for the nodes in $V_2$ (in AIM those nodes are “free”); and transferring budget
between layers (in AIM the partitioning of the budget between layers is fixed by the instance).
In particular, to overcome the last problem we create many copies of $U$ and set the parameters
so that the optimal solution uses approximately the same fraction of the budget in each layer.
The result will follow by observing that increasing the budget on any layer by a constant
factor increases the probability of acceptance by at most a constant factor. While the proof
in this section is self-contained, we encourage the reader to refer back to the description of
Feige’s k-prover proof system in Section 4; in particular, $k$, $\ell$, $Q$, and $R$ below are
parameters of the k-prover proof system.

Construction. We let the seed set contain a single vertex, $S = \{s\}$ (this is without loss of
generality). We then construct three layers: $U$, $V_1$, $V_2$. The source node $s$ is connected
to all nodes $u \in U$ with probability 1, and each node in $V_2$ is connected to the target node
$t$ with probability $1/((1 - 1/e)R)$.

Going back to the k-prover system, the top layer $U$ corresponds to triplets of provers’
answers to questions; the middle layer $V_1$ corresponds to assignments to variables
(distinguished and non-distinguished) that may appear in the verifier’s question to any of the
provers; finally, the bottom layer $V_2$ corresponds to the random strings of the verifier. All
the edges go from the top to the middle layer, or from the middle to the bottom layer.

More specifically, for each triplet $(q, a, i)$ of (question, answer, prover) we have
$\eta = R/(kQ)$ corresponding nodes in $U$. For each pair $(r, \vec{a})$ of (verifier’s random
string, assignment to all $3\ell$ variables) we have a node in $V_1$. Notice that this is
different from [Feige 1998], where the elements to be covered correspond to $(r, a_r, i)$ with
$a_r$ being the assignment only for the distinguished variables. For every $h \in [\eta]$, the
node $(q, a, i, h)$ is connected to all the nodes $(r, \vec{a})$ such that $(q, i) \in r$ and the
restriction of $\vec{a}$ to the variables specified by $(q, i)$ is equal to $a$.

Table II: Summary of notation for Lemma

Notation          Meaning                                            Count
$(q, a, i, h)$    question, answer, prover, copy                     1 vertex in $U$
$(q, a, i)$       question, answer, prover                           $\eta$ vertices in $U$
$(q, i, h)$       question, prover, copy                             $2^{2\ell}$ vertices in $U$
$(q, i)$          question, prover                                   $2^{2\ell} \cdot \eta$ vertices in $U$
$(r, h)$          random string, copy                                $k \cdot 2^{2\ell}$ vertices in $U$
                                                                     ($\forall (q, i) \in r$ and $a \in \{0, 1\}^{2\ell}$)
$(r, \vec{a})$    random string, assignment to all $3\ell$ variables 1 vertex in $V_1$
$r$               random string                                      1 vertex in $V_2$

In particular, for each $i$, each $(r, \vec{a})$ corresponds to only one $(q, a, i)$ (and thus
$\eta$ different nodes $(q, a, i, h)$). Finally, all the edges from $U$ to $V_1$ have
probability $1/(\eta k)$.

For each random string $r$, we have one node in the bottom layer, $V_2$. The node corresponding
to each $r$ is connected to all the nodes $(r, \vec{a})$ in $V_1$ with probability 1. The role of
this layer is to force any good assignment to spread its budget across the different random
strings (i.e. make sure that the provers answer all the questions).

Finally, we set the budget $B = 3R$. See Table II for a summary of notation.

B.1.1. Completeness. Given a satisfying assignment to the 3SAT-5 formula, in the top layer
we let $W \cap U$ be the $\eta k Q$ nodes that correspond to this assignment. Because they all
correspond to the same assignment, for each random string $r$, all $\eta k$ corresponding nodes
in $W \cap U$ are connected to a common node $(r, \vec{a}^*)$. In the middle layer, we let
$W \cap V_1$ be the set of these $R$ nodes (one for each random string $r$). Before sampling the
edges, each $(r, \vec{a}^*)$ has $\eta k$ neighbors in $W \cap U$. After sampling, the probability
that there is a path from $W \cap U$ to $(r, \vec{a}^*)$ is
$1 - (1 - 1/(\eta k))^{\eta k} \approx 1 - 1/e$. In particular, with high probability
approximately $(1 - 1/e)R$ of the nodes in $W \cap V_1$ are activated (e.g. via a Chernoff bound).

Finally, we let $W \cap V_2 = V_2$. With high probability, approximately $(1 - 1/e)R$ of them are
activated. Thus the probability that $t$ is activated is given by
$1 - (1 - 1/((1 - 1/e)R))^{(1 - 1/e)R} \approx 1 - 1/e$.
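
Both approximations in the completeness argument have the form $1 - (1 - 1/m)^m \approx 1 - 1/e$. The following small numerical check is a sketch only: the helper name `at_least_one_success` and the sample values of `m` are assumptions chosen for illustration.

```python
import math

def at_least_one_success(p, m):
    """Probability that at least one of m independent attempts, each
    succeeding with probability p, succeeds."""
    return 1.0 - (1.0 - p) ** m

limit = 1.0 - 1.0 / math.e
for m in (10, 100, 10_000):
    # m plays the role of eta*k (or of (1 - 1/e) R) in the proof.
    print(m, at_least_one_success(1.0 / m, m), limit)
```

Since $(1 - 1/m)^m < 1/e$ for every $m \ge 1$, the computed value always lies slightly above $1 - 1/e$, so the approximation does not overstate the activation probability.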


B.1.2. Soundness. Let $OPT$ denote the optimum value (using budget $2B = 6R$ on a “no”
instance), and let $OPT(B_1, B_2, B_3)$ denote the optimum value among assignments that spend
budget $B_i$ on the $i$-th layer. Clearly, $OPT \le OPT(6R, 6R, 6R)$ since adding nodes can only
increase the value. In fact, any solution can spend at most $R$ budget on the last layer, so
$OPT \le OPT(6R, 6R, R)$. Now, observe that if we fix $W \cap (V_1 \cup V_2)$, the probability of
activating $t$ is a monotone submodular function of $W \cap U$. Thus
$OPT \le 6 \cdot OPT(R, 6R, R)$. Similarly, when we fix $W \cap (U \cup V_2)$, the probability of
activating $t$ is a monotone submodular function of $W \cap V_1$. Therefore,
$OPT \le 36 \cdot OPT(R, R, R)$. In particular, it suffices to show that $OPT(R, R, R)$
is bounded by an arbitrarily small constant.
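
The loss incurred when shrinking the budget of one layer from $6R$ to $R$ can be justified by the subadditivity of nonnegative submodular functions. As a sketch, assume $f(A)$ denotes the probability of activating $t$ when $W \cap U = A$ and the other two layers are fixed as in an optimal solution for budgets $(6R, 6R, R)$, and assume $A$ with $|A| \le 6R$ is split into parts $A_1, \dots, A_6$ of size at most $R$ each (the symbols $f$ and $A_j$ are introduced here for illustration only):

```latex
\begin{align*}
f(A) &= f(A_1 \cup \dots \cup A_6) \\
     &\le \sum_{j=1}^{6} f(A_j)
        && \text{(submodularity and } f \ge 0\text{)} \\
     &\le 6 \max_{j} f(A_j) \\
     &\le 6 \cdot OPT(R, 6R, R)
        && \text{(each } A_j \text{ is feasible for budgets } (R, 6R, R)\text{)}.
\end{align*}
```

Applying the same argument to the middle layer costs another factor of 6.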

In an unsatisfiable instance, any two provers agree for at most a $2^{-c\ell}$-fraction of the
random strings. We will show in Lemma 1 that there are at most
$2 \cdot 2^{-(c/6)\ell} k \cdot R$ good random strings $r$, which are strings $r$ such that there
is a node $(r, \vec{a})$ with more than $2\eta$ neighbors in $W \cap U$. Since for each random
string $r$ there is only one node in $V_2$, each of the good random strings contributes at most
one to the number of activated neighbors of $t$. Before sampling the edges, any node that does
not correspond to a good random string has at most $2\eta$ neighbors in $W \cap U$. After
sampling the edges between $U$ and $V_1$, the probability that any such node has a neighbor in
$W \cap U$ is at most $2\eta \cdot 1/(\eta k) = 2/k$. Again, each such node can contribute at most
one to the number of activated neighbors of $t$. In total, the expected number of activated
neighbors of $t$ is bounded by:

$$(\#\text{ of good strings}) + \frac{2}{k}\,(\#\text{ of bad strings}) \le 2 \cdot 2^{-(c/6)\ell} k \cdot R + \frac{2}{k} R \le \frac{3}{k} R,$$

where the last inequality holds once $\ell$ is large enough that $2^{-(c/6)\ell} k^2 \le 1/2$.

Recall that each neighbor activates $t$ with probability $1/((1 - 1/e)R)$. Therefore, by the
union bound, the probability that any of the $\frac{3}{k} R$ activated neighbors propagates to
$t$ is at most

$$\frac{3}{k(1 - 1/e)},$$

which is an arbitrarily small constant for a sufficiently large constant $k$.

□ 

Lemma 1. There are at most $2 \cdot 2^{-(c/6)\ell} k \cdot R$ good random strings.

Proof. Intuitively, any $(r, \vec{a})$ which has $2\eta$ neighbors in $W \cap U$ corresponds to an
agreement of at least two provers, and therefore should be a rare event. In order to turn this
intuition into a proof, we must rule out solutions that distribute the budget in an uneven manner
that does not correspond to answers of provers to the verifier’s questions. Fix any assignment
to the “no” instance. In the next few paragraphs, we repeatedly apply Markov’s inequality to
bound the number of: “heavy $(q, i)$”, for which the assignment allocates $2^{(c/6)\ell}$ times
more than the expected budget; “heavy $(q, i, h)$”, for which $2^{(c/3)\ell}$ times more than the
expected budget is allocated; and “good $(r, h)$”, for which two provers agree, i.e. some node
$(r, \vec{a})$ has more than one neighbor $(q, a, i, h)$ in $W \cap U$.

For any prover $i$, there are at most $\eta k Q$ corresponding nodes in $W \cap U$, so at most
$\eta k$ in expectation over $q$. By Markov’s inequality, for at most a
$2^{-(c/6)\ell}$-fraction of the $q$’s, more than $2^{(c/6)\ell} \cdot \eta k$ nodes belong to
$W \cap U$; we call those $(q, i)$’s heavy, and light otherwise. We henceforth focus on bounding
the number of good random strings that correspond only to light $(q, i)$’s.

$$\Pr_r\left[\exists i : (q, i) \text{ is heavy}\right] \le 2^{-(c/6)\ell} \cdot k.$$
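
The counting step used repeatedly in this proof is Markov’s inequality applied to a budget allocation: at most a $1/\lambda$ fraction of the items can receive more than $\lambda$ times the average. A minimal sketch follows, with a hypothetical helper `fraction_above` and a made-up allocation over eight questions; in the proof the role of $\lambda$ is played by a power of two that grows with $\ell$.

```python
def fraction_above(loads, factor):
    """Fraction of entries exceeding `factor` times the mean of `loads`."""
    mean = sum(loads) / len(loads)
    return sum(1 for x in loads if x > factor * mean) / len(loads)

# Hypothetical budget allocation over 8 questions: one question hoards budget.
loads = [0, 0, 0, 1, 1, 1, 1, 28]
for factor in (2, 4, 8):
    # Markov's inequality: at most a 1/factor fraction can be "heavy".
    print(factor, fraction_above(loads, factor), 1 / factor)
```

However unevenly the budget is spread, the heavy fraction never exceeds `1 / factor`, which is why uneven solutions cannot create many heavy pairs.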

Recall that for each triplet $(q, a, i)$, we have $\eta$ nodes in $U$ (with identical
neighborhoods). For $1 \le h \le \eta$, we label the $h$-th such node by $(q, a, i, h)$. Fix any
light $(q, i)$. For each $h$, in expectation, $W \cap U$ contains at most
$2^{(c/6)\ell} \cdot k$ nodes $(q, a, i, h)$. Using Markov’s inequality again, for at most a
$2^{-(c/6)\ell}$-fraction of the $h$’s, $W \cap U$ contains more than $2^{(c/3)\ell} \cdot k$
nodes $(q, a, i, h)$. We abuse notation and call any such triplet $(q, i, h)$ heavy, and light
otherwise. In particular, for any $r$ such that all the corresponding $(q, i)$’s are light, at
most a $2^{-(c/6)\ell}$-fraction of the corresponding $(q, i, h)$’s are heavy. For each heavy
$(q, i, h)$, any $(r, \vec{a})$ has only one neighbor $(q, a, i, h)$. Thus to each
$(r, \vec{a})$, all the heavy $(q, i, h)$’s together contribute at most
$2^{-(c/6)\ell} k \cdot \eta$ neighbors in $W \cap U$. We henceforth ignore the heavy
$(q, i, h)$’s, and add these $2^{-(c/6)\ell} k \cdot \eta$ nodes at the end.

$$\forall (r, \vec{a}):\quad \#\left\{ (q, a, i, h) : (q, a, i, h) \in \mathcal{N}(r, \vec{a}) \cap (W \cap U) \text{ and } (q, i, h) \text{ is heavy} \right\} \le 2^{-(c/6)\ell} k \cdot \eta.$$

Consider only light $(q, i, h)$’s. Then for each $h$ and light $(q, i)$, there are at most
$2^{(c/3)\ell} \cdot k$ nodes $(q, a, i, h)$ in $W \cap U$. In other words, for each $h$, each
prover has at most $2^{(c/3)\ell} \cdot k$ answers to each question. Since we started from an
unsatisfiable instance, we have that for any pair of provers, at most a
$2^{-c\ell} \cdot (2^{(c/3)\ell} \cdot k)^2$-fraction of random strings have at least one pair of
agreeing answers (Theorem 4.2). Keeping $h$ fixed and summing over all pairs of provers, this
corresponds to at most a $2^{-(c/3)\ell} \cdot k^4$-fraction of random strings $r$ such that some
node $(r, \vec{a})$ has more than one neighbor $(q, a, i, h)$ in $W \cap U$. We say that a pair
$(r, h)$ is good if for some $\vec{a}$ the node $(r, \vec{a})$ has more than one neighbor
$(q, a, i, h)$ in $W \cap U$.

$$\Pr_{r,h}\left[(r, h) \text{ is good}\right] \le 2^{-(c/3)\ell} \cdot k^4.$$

Finally, for each random string $r$, in expectation, at most a
$2^{-(c/3)\ell} \cdot k^4$-fraction of the $h$’s satisfy that $(r, h)$ is good. Applying Markov’s
inequality one more time, we have that for at most a $2^{-(c/6)\ell}$-fraction of the $r$’s,
$(r, h)$ is good for more than a $2^{-(c/6)\ell} \cdot k^4$-fraction of the $h$’s. We claim that
these $r$’s, together with the ones that correspond to heavy $(q, i)$’s, are the only good
random strings. Notice that there are at most $2 \cdot 2^{-(c/6)\ell} k \cdot R$ of them.

It is left to prove that if $(r, h)$ is good for at most $2^{-(c/6)\ell} \cdot \eta k^4$ of the
$h$’s, then $r$ cannot be a good random string. For each $(r, \vec{a})$, each good $(r, h)$
contributes at most $k$ neighbors in $W \cap U$. Together with the additional
$2^{-(c/6)\ell} \eta k$ neighbors due to heavy $(q, i, h)$’s and a single neighbor for each other
$h$, we have that the number of neighbors of $(r, \vec{a})$ in $W \cap U$ is at most

$$(1 + k^4) \cdot 2^{-(c/6)\ell} \cdot \eta k + \eta < 2\eta.$$

□ 

B.2. Exponential factor hardness 

We are now ready to complete the proof of Theorem B.2. We concatenate $n^{1-\epsilon}$ copies of
the hard 3-layer APM instance, each of size $n^{\epsilon}$. (So that the total number of nodes is
$n^{1-\epsilon} \cdot n^{\epsilon} = n$, and the blowup in size is polynomial in $n^{\epsilon}$,
for any constant $\epsilon$.) Specifically, by concatenation we mean that we identify $t_i$, the
target node of the $i$-th copy, with $s_{i+1}$, the source node of the $(i+1)$-th copy. The total
budget is set to $n^{1-\epsilon}(3R + 1)$.

Completeness. If we can achieve value 1/2 on each copy, the final activation probability of
the last target node $t_{n^{1-\epsilon}}$ is $2^{-n^{1-\epsilon}}$.

Soundness. We can allocate budget greater than $6R$ to at most half the instances. On the
other half of the instances we would achieve value at most $\alpha$, where $\alpha$ is an
arbitrarily small constant which depends on our instantiation of the 3-layer APM (in particular,
$\alpha = 1/16$ suffices). Therefore, the final activation probability is at most

$$\alpha^{n^{1-\epsilon}/2} = 2^{-2 n^{1-\epsilon}}.$$

□ 
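
The arithmetic behind the amplification can be checked exactly. The sketch below assumes a chain of `num_copies` three-layer instances (standing in for the number of concatenated copies), completeness value 1/2 per copy, and soundness value at most `alpha` on at least half of the copies; the function name `amplification_gap` is hypothetical.

```python
from fractions import Fraction
import math

def amplification_gap(num_copies, alpha=Fraction(1, 16)):
    """Return (completeness, soundness_bound, log2 of their ratio) for a chain
    of `num_copies` three-layer instances (num_copies assumed even)."""
    completeness = Fraction(1, 2) ** num_copies
    # At least half of the copies receive budget at most 6R, hence value <= alpha.
    soundness = alpha ** (num_copies // 2)
    ratio = completeness / soundness
    # log2 of big integers is safe in Python, unlike converting the ratio to float.
    gap_bits = math.log2(ratio.numerator) - math.log2(ratio.denominator)
    return completeness, soundness, gap_bits

print(amplification_gap(10))
```

With `alpha = 1/16` the gap between the two cases is exactly `num_copies` bits, i.e. the ratio of the completeness value to the soundness bound is 2 raised to the number of copies.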


Remark. One can easily generalize the APM problem to support a target set $T$ of nodes, with
the goal of maximizing the expected number of active nodes in the intersection of the target
set and the selected set. We define this as the APM-m problem below.
Definition B.4. [APM-m] We are given a graph $G = (V, E)$ with independent probabilities $p_e$ on
the edges, a seed set $S \subseteq V$, a target set $T \subseteq V \setminus S$, and a budget $B$.
The problem of APM-m is to find a subset $W \subseteq V$ of size $|W| \le B$. Let $G(S \cup W)$ be
the subgraph of $G$ induced by the nodes in $S \cup W$, and suppose that influence diffusion in
$G(S \cup W)$ follows the independent cascade model with edge probability $p_e$ for every edge $e$
in the subgraph $G(S \cup W)$. The goal of APM-m is to maximize the expected number of active
nodes in $T \cap W$ when influence diffusion starts from the seed set $S$ and is restricted to the
subgraph $G(S \cup W)$.
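
For very small graphs the APM-m objective can be evaluated exactly by enumerating live-edge outcomes, using the standard equivalence between the independent cascade model and reachability in a random live-edge graph. This is only a sketch: the function name `apm_m_value`, the edge-dictionary format, and the toy instance are assumptions, and the enumeration is exponential in the number of edges.

```python
from itertools import product

def apm_m_value(edges, seeds, selected, targets):
    """Exact expected number of active nodes in targets & selected, with
    diffusion restricted to the subgraph induced by seeds | selected.
    Enumerates all live-edge outcomes, so only suitable for tiny graphs."""
    allowed = set(seeds) | set(selected)
    sub = [(u, v, p) for (u, v), p in edges.items() if u in allowed and v in allowed]
    expected = 0.0
    for outcome in product([True, False], repeat=len(sub)):
        weight = 1.0
        live = {}
        for (u, v, p), is_live in zip(sub, outcome):
            weight *= p if is_live else 1.0 - p
            if is_live:
                live.setdefault(u, []).append(v)
        # Nodes reachable from the seeds through live edges are active.
        active, stack = set(seeds), list(seeds)
        while stack:
            u = stack.pop()
            for v in live.get(u, []):
                if v not in active:
                    active.add(v)
                    stack.append(v)
        expected += weight * len(active & set(targets) & set(selected))
    return expected

# Toy instance (assumed): s -> a -> b and s -> b, all with probability 1/2.
toy = {("s", "a"): 0.5, ("a", "b"): 0.5, ("s", "b"): 0.5}
print(apm_m_value(toy, {"s"}, {"a", "b"}, {"a", "b"}))
print(apm_m_value(toy, {"s"}, {"b"}, {"a", "b"}))
```

Note that a target node contributes to the objective only if it is also selected, which is why dropping `a` from the selected set removes both its own contribution and the path through it.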

Since APM-m is a generalization of APM with a single target, the near-exponential hardness
of APM directly applies to this generalization. We further remark that the proof of the
constant-factor hardness of APM for three-layer graphs can be adapted to show the
constant-factor hardness of APM-m for three-layer graphs (with one additional node as the single
seed connecting to all first-layer nodes with edge probability 1, and the third-layer nodes as
the targets).



